Contributing to Scrapy

Important

Double check that you are reading the most recent version of this document at http://doc.scrapy.org/en/master/contributing.html

There are many ways to contribute to Scrapy. Here are some of them:

  • Blog about Scrapy. Tell the world how you’re using Scrapy. This gives newcomers more examples and helps the Scrapy project increase its visibility.
  • Report bugs and request features in the issue tracker, trying to follow the guidelines detailed in Reporting bugs below.
  • Submit patches for new functionality and/or bug fixes. Please read Writing patches and Submitting patches below for details on how to write and submit a patch.
  • Join the scrapy-users mailing list and share your ideas on how to improve Scrapy. We’re always open to suggestions.

Reporting bugs

Note

Please report security issues only to scrapy-security@googlegroups.com. This is a private list only open to trusted Scrapy developers, and its archives are not public.

Well-written bug reports are very helpful, so keep in mind the following guidelines when reporting a new bug.

  • check the FAQ first to see if your issue is addressed in a well-known question
  • check the open issues to see if it has already been reported. If it has, don’t dismiss the report; check the ticket history and comments, as you may find additional useful information to contribute.
  • search the scrapy-users list to see if it has been discussed there, or if you’re not sure whether what you’re seeing is a bug. You can also ask in the #scrapy IRC channel.
  • write complete, reproducible, specific bug reports. The smaller the test case, the better. Remember that other developers won’t have your project to reproduce the bug, so please include all relevant files required to reproduce it.
  • include the output of scrapy version -v so developers working on your bug know exactly which version and platform it occurred on. This is often very helpful for reproducing the bug or for knowing whether it was already fixed.

Writing patches

The better written a patch is, the higher the chance that it will be accepted and the sooner it will be merged.

Well-written patches should:

  • contain the minimum amount of code required for the specific change. Small patches are easier to review and merge. So, if you’re doing more than one change (or bug fix), please consider submitting one patch per change. Do not collapse multiple changes into a single patch. For big changes consider using a patch queue.
  • pass all unit-tests. See Running tests below.
  • include one (or more) test cases that check the bug fixed or the new functionality added. See Writing tests below.
  • if you’re adding or changing a public (documented) API, please include the documentation changes in the same patch. See Documentation policies below.

Submitting patches

The best way to submit a patch is to issue a pull request on GitHub, optionally creating a new issue first.

Remember to explain what was fixed or the new functionality (what it is, why it’s needed, etc.). The more information you include, the easier it will be for core developers to understand and accept your patch.

You can also discuss the new functionality (or bug fix) before creating the patch, but it’s always good to have a patch ready to illustrate your arguments and show that you have put some additional thought into the subject. A good starting point is to send a pull request on GitHub. It can be just enough to illustrate your idea, leaving documentation and tests for later, once the idea has been validated and proven useful. Alternatively, you can send an email to scrapy-users to discuss your idea first.

Finally, try to keep aesthetic changes (PEP 8 compliance, removal of unused imports, etc.) in separate commits from functional changes. This will make pull requests easier to review and more likely to get merged.

Coding style

Please follow these coding conventions when writing code for inclusion in Scrapy:

  • Unless otherwise specified, follow PEP 8.
  • It’s OK to use lines longer than 80 chars if it improves the code readability.
  • Don’t put your name in the code you contribute. Our policy is to keep the contributor’s name in the AUTHORS file distributed with Scrapy.

Scrapy Contrib

Scrapy contrib shares a similar rationale to Django contrib, which is explained in this post. If you are working on new functionality, please follow that rationale to decide whether it should be a Scrapy contrib. If unsure, you can ask in scrapy-users.

Documentation policies

  • Don’t use docstrings for documenting classes, or methods which are already documented in the official (sphinx) documentation. For example, the ItemLoader.add_value() method should be documented in the sphinx documentation, not its docstring.
  • Do use docstrings for documenting functions not present in the official (sphinx) documentation, such as functions from scrapy.utils package and its sub-modules.
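As a rough illustration of the second policy, a utility function that is not covered by the sphinx documentation might carry its own docstring. The function below, strip_url_fragment, is hypothetical and not part of scrapy.utils; it only sketches the shape such a docstring could take.

```python
from urllib.parse import urldefrag


def strip_url_fragment(url):
    """Return ``url`` without its fragment (the part after ``#``).

    Hypothetical helper used only to illustrate docstring style;
    it is not an actual Scrapy function.

    >>> strip_url_fragment("http://example.com/page#section")
    'http://example.com/page'
    """
    # urldefrag splits the URL into (url_without_fragment, fragment).
    return urldefrag(url).url
```

The docstring states what the function returns and includes a short doctest, so the behaviour is documented even though the function never appears in the official documentation.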

Tests

Tests are implemented using the Twisted unit-testing framework. Running tests requires tox.

Running tests

Make sure you have a recent enough tox installation:

tox --version

If your version is older than 1.7.0, please update it first:

pip install -U tox
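The comparison against 1.7.0 can also be scripted. The sketch below is not part of Scrapy or tox: the helper names parse_version and is_recent_enough are made up for illustration, and the code assumes a plain dotted numeric version string (pre-release suffixes are not handled). Extracting the version number from the command output is left out.

```python
MINIMUM_TOX = (1, 7, 0)  # minimum version stated in this document


def parse_version(text):
    """Turn a dotted version string such as '1.7.0' into a tuple of ints."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pad with zeros so that '1.7' compares equal to '1.7.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_recent_enough(version_text, minimum=MINIMUM_TOX):
    """Return True if version_text is at least the minimum version."""
    return parse_version(version_text) >= minimum
```

Comparing tuples of integers avoids the pitfall of comparing version strings directly, where "1.10.0" would sort before "1.7.0".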

To run all tests go to the root directory of Scrapy source code and run:

tox

To run a specific test (say tests/test_loader.py) use:

tox tests/test_loader.py

Writing tests

All functionality (including new features and bug fixes) must include a test case to check that it works as expected, so please include tests for your patches if you want them to get accepted sooner.
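A minimal sketch of what such a test case can look like follows. It rests on two assumptions: the function under test, join_words, is a hypothetical stand-in for the code being patched, and the standard library unittest module is used so that the example runs without extra dependencies, whereas Scrapy's own tests use the Twisted unit-testing framework.

```python
import unittest


def join_words(words, separator=" "):
    """Hypothetical function under test: join the non-empty words."""
    return separator.join(word for word in words if word)


class JoinWordsTest(unittest.TestCase):

    def test_joins_with_default_separator(self):
        self.assertEqual(join_words(["a", "b"]), "a b")

    def test_skips_empty_words(self):
        # Regression test for the (hypothetical) bug being fixed:
        # empty strings must not produce doubled separators.
        self.assertEqual(join_words(["a", "", "b"]), "a b")


def run_tests():
    """Run the test case and return True if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(JoinWordsTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

One test covers the normal behaviour and one pins down the bug being fixed, so a reviewer can see from the test names what the patch is meant to guarantee.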

Scrapy uses unit-tests, which are located in the tests/ directory. Their module name typically resembles the full path of the module they’re testing. For example, the item loaders code is in:

scrapy.loader

And their unit-tests are in:

tests/test_loader.py
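The naming pattern can be summarized in a few lines of code. This is only a sketch of the convention described above: the helper expected_test_path is hypothetical, and because test module names only typically follow the module path, the real file name may differ for some modules.

```python
def expected_test_path(module_name):
    """Guess the test file for a dotted module name such as 'scrapy.loader'."""
    parts = module_name.split(".")
    # Drop the leading 'scrapy' package; the rest is joined with underscores.
    if parts[0] == "scrapy":
        parts = parts[1:]
    return "tests/test_" + "_".join(parts) + ".py"
```

For the example in the text, expected_test_path("scrapy.loader") gives tests/test_loader.py. For a nested module, the same rule would suggest a name such as tests/test_utils_url.py for scrapy.utils.url, which is a guess based on the pattern and should be checked against the tests/ directory.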