Google's Summer of Code 2014
The application process for the 2014 Google Summer of Code is open, and Django has formally applied to be a mentor organization (read Google's page for more information on how the program works). If past results are an indication of the future, it's quite likely we'll be accepted as a mentor organization. This page is a placeholder for ideas in the meantime, since we seem to have a lot of eager people wanting to get started!
Django's GSoC program is being run by Tim Graham.
Mentors
If you're interested in mentoring -- supervising a student in work on Django-related activities -- add your name, email, and the sort of projects you're interested in mentoring here:
- Tim Graham (timograham@…) - TBA
- Marc Tamlyn (marc.tamlyn@…) - test suite improvements
- Russell Keith-Magee (russell@…) - Meta refactor, Reducing coupling
Students
The student application period opens on March 10 and ends on March 21.
If you'd like to get started on your proposal early, we'll be looking for a few things. Note that we've widened our project scope this year!
- You'll need to have a concrete task in mind (some ideas are below) along with a solid idea of what will constitute "success" (you tell us).
- If your proposal is a single large feature, library or site, you'll need to present a detailed design specification. This proposal should be posted to django-developers, where it can be refined until it is accepted by the developer community.
- We'll want to know a bit about you -- links to previous work are great, if any. If you're proposing something ambitious, you'll need to convince us that you're up to the task.
- You'll also need to provide us with a schedule, including a detailed work breakdown and major milestones so your mentor can know if and when to nag you :)
Here's an example of an accepted proposal from last year:
Note that none of the ideas below are good enough to be submissions in their own right (so don't copy and paste)! We'll want to know not just what you want to do but how you plan to pull it off.
Don't feel limited to the ideas below -- if you've got a cool project you want to work on, we'll probably be able to find you a mentor. We plan on approving as many projects as we possibly can.
We're accepting any GSoC proposal that fits one of the following three categories:
- Work on Django itself - such as the ORM, forms, etc. This is what we've traditionally accepted GSoC entries in.
- Work on tools to support Django - the dashboard (https://dashboard.djangoproject.com/) is a good example of an existing tool that would have fitted into this category.
- Work on libraries that supplement or add new features to Django to ease development - South and Django Debug Toolbar are good examples of existing projects that would have fitted here.
We're not looking for people to work on existing third-party libraries - we aren't able to guarantee commit access to them. We may allow an exception if a maintainer of the library in question agrees to help mentor beforehand.
The broadening in scope is to allow people to work on new ideas to help Django development and developers without tying you down to having to implement it in the core codebase (and thus ruling out some projects that might otherwise be useful).
We're still going to be strict with what we accept - you'll need to provide a strong use case for your idea and show that it would be useful to a majority of developers or significantly improve the development of Django itself.
We're not looking for small groups of incremental updates - like "improve Django's Trac" - nor are we looking for impossible tasks, like "replace Trac with this brand new issue tracker I'm writing". What you propose should be a single project, achievable within the time period of GSoC, and something the core developers can help mentor you on.
We're also not looking for sites or projects that are merely written in Django - this GSoC is not for you to propose your new forum hosting site or amazing Django-based blogging engine.
Note that when you contribute code, you will be expected to adhere to the same contribution guidelines as any other code contributor. This means you will be expected to provide extensive tests and documentation for any feature you add, and to participate in discussion on django-developers when your topic of interest is raised. If you're not already familiar with Django's contribution guidelines, now would be a good time to read them - even if you're not applying to work on Django core directly, we'll still want the same level of contribution.
Communication
This year we're doing all GSoC-related communication via the django-developers mailing list. Any proposals for GSoC should be submitted there, as well as discussion on the proposed projects and any updates that students post.
Please be careful to keep content to the list clear and purposeful; if you have an idea, update, or criticism, please make sure you describe it in detail; it can be tedious asking people to clarify any vague statements, or having vital information drip-fed.
Ideas
Here are some suggestions for projects students may want to propose (please feel free to add to this list!). This isn't by any means the be-all and end-all of ideas; please feel free to submit proposals for things not on this list. Remember, we'd much prefer that you posted a draft proposal and your rough timeline / success conditions to the django-developers list, even if it's already on the list below; it will help you get feedback on choosing the right part of a problem, as well as helping to see if there is any interest before you start drafting a full proposal.
When developing your proposal, try to scope ideas/proposals to the 4-month timeline -- simply proposing to fix a ticket or two will probably result in your proposal being rejected in favor of a more ambitious one. The GSoC does not cover activities other than coding, so certain ideas ("Write a more detailed tutorial" or "Create demonstration screencasts" or "Add a pony") are not suitable for inclusion here.
On the other side, though, be sure to be concrete in your proposal. We'll want to know what your goals are, and how you plan to accomplish them.
In no particular order:
Best practices updates
- Complexity: Moderate
Over the years, as Django has evolved, the idea of what constitutes "best practice" has also evolved. However, some parts of Django haven't kept up with those best practices. For example, contrib.comments and contrib.databrowse aren't deployable apps in the same sense as contrib.admin. As a result, these apps can't be (easily) deployed multiple times, and they can't use URL namespacing.
In addition, some features of Django's core have grown and evolved, and need refactoring. For example, validation is now performed in several places, but these checks don't operate by hooking into the core 'validate' command. In addition, many aspects of the core validate command should be farmed out to the things that are being validated (e.g., the max/min conditions on a field should be validated by the field, not by a third-party validator).
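As a rough illustration of validation being owned by the thing that is validated, here is a minimal, framework-free sketch. The `Field`, `BoundedIntegerField`, `check()` and `validate()` names are hypothetical stand-ins chosen for this example, not Django's actual API.

```python
class Field:
    """Hypothetical stand-in for a model field; not Django's actual Field class."""

    def __init__(self, name):
        self.name = name

    def check(self):
        """Return a list of error strings describing problems with this field."""
        return []


class BoundedIntegerField(Field):
    """A field that knows how to validate its own min/max conditions."""

    def __init__(self, name, min_value, max_value):
        super().__init__(name)
        self.min_value = min_value
        self.max_value = max_value

    def check(self):
        errors = super().check()
        # The field itself reports an inconsistent min/max configuration.
        if self.min_value > self.max_value:
            errors.append(
                "%s: min_value (%d) is greater than max_value (%d)"
                % (self.name, self.min_value, self.max_value)
            )
        return errors


def validate(fields):
    """Central 'validate' entry point: it only collects what each field reports."""
    errors = []
    for field in fields:
        errors.extend(field.check())
    return errors
```

With this shape, the central command needs no knowledge of individual field types; adding a new field type only means giving it its own `check()` method.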
In short, Django has been bad at eating its own dogfood. The contents of contrib should be audited and updated to make sure it meets current best practices.
Issues to consider:
- What components need to be updated, and why?
- How to do this update while maintaining backwards compatibility?
See also:
Test framework cleanup
- Complexity: Low
Django has an extensive test framework for Python code, a suite of tools to make server-side testing easier, and a project policy that no new code is added without tests. This has been a significant contributor to the stability of Django as a project.
For the 1.4 release, we also included the basis of a client-side testing framework into Django (https://docs.djangoproject.com/en/dev/topics/testing/#django.test.LiveServerTestCase)
However, this means Django now has a very large and powerful test suite that gives users little separation or control. The goal of this project would be to add new options and suite types that make it easy to run only specific types of tests (e.g. unit tests only) or to exclude tests (such as the ones in contrib or third-party apps) from the main test run.
Issues to consider:
- How would users declare which tests they want to run?
- Which tests should be enabled by default, and how hard should this be to change?
- How will app maintainers run their tests?
- Should there be additional hooks to, for example, allow tests to be run against different database backends in sequence?
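One possible way to let users declare which tests to run is to label test classes with a category and filter the suite before running it. The sketch below uses only the standard library's `unittest`; the `category` decorator, the `test_category` attribute and the `select()` helper are hypothetical names invented for this illustration, not an existing Django feature.

```python
import unittest


def category(name):
    """Class decorator that labels a TestCase with a hypothetical category name."""
    def decorator(cls):
        cls.test_category = name
        return cls
    return decorator


@category("unit")
class SlugTests(unittest.TestCase):
    def test_lowercase(self):
        self.assertEqual("Hello".lower(), "hello")


@category("browser")
class LoginPageTests(unittest.TestCase):
    def test_placeholder(self):
        self.assertTrue(True)


def iter_tests(suite):
    """Flatten a possibly nested TestSuite into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            yield from iter_tests(item)
        else:
            yield item


def select(suite, include=None, exclude=()):
    """Build a new suite containing only tests from the wanted categories.

    Tests without a label are treated as belonging to the "unit" category.
    """
    selected = unittest.TestSuite()
    for test in iter_tests(suite):
        cat = getattr(test, "test_category", "unit")
        if cat in exclude:
            continue
        if include is None or cat in include:
            selected.addTest(test)
    return selected
```

A runner option could then map onto `select(full_suite, include={"unit"})` or `select(full_suite, exclude={"browser"})`; which categories are enabled by default is exactly the kind of question a proposal would need to answer.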
See also:
- #13873 (more of a symptom of this problem)
- More tickets need to be added here
Security Enhancements
- Complexity: Medium
Django has developed many security features over time. The existing set of security features is pretty good, but there's lots of room for improvement. Much of the work in this project will be related to cleaning up existing code to make it more obviously secure, eliminate edge cases, and improve fallback handling.
Some potential areas of work include:
- Enhancing CSRF protection (#16859)
- Centralizing randomized token issuance and validation
- Integrating carljm's django-secure project (https://github.com/carljm/django-secure)
- Building an interactive admin dashboard to display and check installation security parameters
- Targeted code audit for a specific list of security errors
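To make "centralizing randomized token issuance and validation" more concrete, here is a minimal sketch of a single service that both issues and checks tokens, built on the standard library only. The `TokenService` class, its method names and the token layout are assumptions made for this example; they do not describe Django's existing CSRF or password-reset code.

```python
import hashlib
import hmac
import secrets


class TokenService:
    """Hypothetical single place for issuing and checking random tokens."""

    def __init__(self, secret_key):
        self._key = secret_key.encode("utf-8")

    def _sign(self, purpose, nonce):
        # Binding the purpose into the signature stops a token issued for
        # one feature from being accepted by another.
        message = ("%s:%s" % (purpose, nonce)).encode("utf-8")
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def issue(self, purpose):
        """Return a fresh token of the form '<random nonce>.<signature>'."""
        nonce = secrets.token_urlsafe(32)
        return "%s.%s" % (nonce, self._sign(purpose, nonce))

    def validate(self, purpose, token):
        """Return True only if the token was issued by this service for purpose."""
        nonce, sep, signature = token.partition(".")
        if not sep:
            return False
        expected = self._sign(purpose, nonce)
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(
            signature.encode("utf-8"), expected.encode("utf-8")
        )
```

Keeping issuance and validation in one place means that choices such as the random source, the signature algorithm and the comparison function are made once and can be audited once.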
While an interest in security will make these tasks more interesting, most of them don't require you to be a security expert already. Your mentor will make sure your plan is correct before you code, and carefully review your work before it is committed to trunk. Most of these tasks will be significantly easier if you already have some familiarity with Django's codebase. A successful application will have a plan which selects related areas of work, provides details, and has a good estimation of complexity for the proposed tasks. Remember that (especially for security work) a good patch often has more lines of tests than code changes. An ideal applicant will be able to demonstrate the skill with Python and attention to detail necessary to make fundamental changes to Django without breaking existing code.
Ideas that will probably not be accepted:
- Adding database or cookie encryption support (unless you can provide a secondary mentor who is a crypto expert)
- Proposals that strongly couple sessions with CSRF or Auth
- Proposals to include external libraries in Django
If you are interested in working on this project, please talk to us sooner rather than later! PaulM is usually available on IRC, and wants to help you write a really good application.
Improved error reporting
- Complexity: Medium
The error messages raised by Django can sometimes be confusing or misleading. This is sometimes due to Django wrapping and re-raising errors when it shouldn't. Sometimes it's due to Django not displaying error information effectively. Sometimes it's simply a matter of not catching the right errors.
This should be fixed. Error messages are just as important to the development process as good documentation. This project would address the error reporting issues in Django to ensure that the errors reported by a Django project are as good as they can be.
Issues to consider:
- Import errors discovered during application loading can be masked under certain circumstances.
- Errors in template tags and filters rarely produce helpful error messages.
- Mistakes in ModelForm and ModelAdmin definitions can raise errors that don't indicate the real problem.
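The masked import error case can be illustrated without Django. A loader that swallows every `ImportError` cannot tell "this optional module does not exist" apart from "this module exists but one of its own imports failed". The sketch below, using a hypothetical helper named `load_optional_module`, distinguishes the two by inspecting the exception's `name` attribute; it is one possible approach, not a description of how Django's loading code works.

```python
import importlib


def load_optional_module(dotted_path):
    """Import a module that is allowed to be absent, without hiding real bugs.

    Returns the module, or None when the module itself does not exist.
    An ImportError raised *inside* the module is re-raised with context.
    """
    try:
        return importlib.import_module(dotted_path)
    except ImportError as exc:
        missing = getattr(exc, "name", None)
        # Only treat the error as "module absent" when the missing name is
        # the requested module or one of its parent packages.
        if missing and (
            dotted_path == missing or dotted_path.startswith(missing + ".")
        ):
            return None
        # Otherwise the module exists but is broken: say which module was
        # being loaded, and chain the original exception for the traceback.
        raise ImportError(
            "Error while importing %r: %s" % (dotted_path, exc)
        ) from exc
```

The re-raised message names both the module being loaded and the import that actually failed, which is the information a developer needs to find the real problem.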
See also:
- The Better Error Messages proposal page
- Ticket #3349
Improve annotation and aggregation
- Complexity: Medium
The 2009 Summer of Code added the annotate() and aggregate() calls to Django's query arsenal. While these tools work well for simple arithmetic aggregates, they don't work well for date and string based queries. There are also use cases where you may want to annotate data onto a model that *isn't* an aggregate (for example, annotating the sum of two other aggregates).
This project would continue where the 2009 GSoC aggregation project left off. This would be an excellent project for anyone wishing to gain an intimate understanding of Django's Query infrastructure.
Issues to consider:
- String concatenation and manipulation (e.g., annotate a model with the uppercase version of the first 5 characters of someone's name)
- Grouping of results by date (e.g., show me a count of articles, grouped by day)
- Allowing non-null defaults in aggregation (e.g., when a model has no related objects, use 0 not NULL)
- Aggregates involving generic relations
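To show the kind of results these issues are asking for, the sketch below runs the equivalent raw SQL against an in-memory SQLite database using Python's standard `sqlite3` module. The `author` and `article` tables and their data are invented for the example, and this is plain SQL rather than Django ORM syntax; designing the ORM API that would generate such queries is the project itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (
        id INTEGER PRIMARY KEY,
        author_id INTEGER REFERENCES author(id),
        published TEXT,  -- ISO 8601 timestamp
        words INTEGER
    );
    INSERT INTO author VALUES (1, 'Alexandra'), (2, 'Bob');
    INSERT INTO article VALUES
        (1, 1, '2014-03-10 09:00:00', 100),
        (2, 1, '2014-03-10 17:30:00', 250),
        (3, 1, '2014-03-11 08:15:00', 50);
""")

# String manipulation: uppercase version of the first 5 characters of a name.
short_names = conn.execute(
    "SELECT UPPER(SUBSTR(name, 1, 5)) FROM author ORDER BY id"
).fetchall()

# Grouping by date: a count of articles, grouped by day.
per_day = conn.execute(
    "SELECT DATE(published) AS day, COUNT(*) FROM article "
    "GROUP BY day ORDER BY day"
).fetchall()

# Non-null default: SUM over no related rows is NULL, so COALESCE supplies 0.
totals = conn.execute(
    "SELECT author.name, COALESCE(SUM(article.words), 0) "
    "FROM author LEFT JOIN article ON article.author_id = author.id "
    "GROUP BY author.id ORDER BY author.id"
).fetchall()

print(short_names)  # [('ALEXA',), ('BOB',)]
print(per_day)      # [('2014-03-10', 2), ('2014-03-11', 1)]
print(totals)       # [('Alexandra', 400), ('Bob', 0)]
```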
See also:
- Trac's list of ORM aggregation tickets
- Django's QuerySet implementation (source:django/trunk/django/db/query.py)
Finishing off Form Templates
- Complexity: Hard
Two years ago, a GSoC project worked on replacing the internal Django code that renders forms with a templated system, allowing for much better flexibility and customisability of forms, fields, and related components in the forms framework. The current code can be found here: https://github.com/gregmuellegger/django/commits/soc2011/form-rendering
The main issue with the branch last year was that the template renderer was not fast enough on large numbers of includes and extends, meaning that the new form templates, while modular, were slower than the current forms system. The major task with this project would be to address the speed issue. Note that this was attempted last year, and didn't work out so well - if you want to take on this project, we'll want to see a clear plan of how you attempt to address the issue and some proof that you're capable of pulling it off.
Formalizing the Meta object
- Complexity: Medium
Every Django model has a _meta attribute that contains lots of useful introspection metadata about the model, its fields, and its relationships with other models. This _meta object is essential to the operation of Django itself, providing the framework for features like ModelForms, and the automatic behaviour of the Admin.
However, officially, the Meta object is not a stable API - it's an internal. This is for two reasons -- firstly, nobody has ever bothered to write the documentation, and secondly, because the API that Meta currently provides is a little bit of a mess -- there's duplicated functionality, some strange caching behaviour, and some other interesting quirks.
It would be desirable to be able to formally publish a stable API spec for Meta. Many projects already rely on Meta as if it were stable API. This would involve auditing the current contents of Meta, doing some API cleanup to provide a consistent and clean API, and documenting that API. Issues of backwards compatibility need to be kept in mind -- even though Meta isn't formally covered by Django's backwards compatibility policy, it's a de facto standard, so we can't just change it arbitrarily.
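A first step in auditing an object's contents could be a small helper that lists its public attributes and separates methods from data. The sketch below uses the standard `inspect` module; `audit_public_api` is a hypothetical helper, and `FakeOptions` is an invented stand-in used only so the example runs without Django. In a real audit the helper would be pointed at a model's `_meta`.

```python
import inspect


def audit_public_api(obj):
    """Split an object's public attribute names into methods and data attributes."""
    methods, data = [], []
    # inspect.getmembers returns (name, value) pairs sorted by name.
    for name, value in inspect.getmembers(obj):
        if name.startswith("_"):
            continue  # skip private and dunder attributes
        (methods if callable(value) else data).append(name)
    return {"methods": methods, "data": data}


class FakeOptions:
    """Hypothetical stand-in for a model's _meta object, for demonstration only."""

    def __init__(self):
        self.app_label = "blog"
        self.fields = ["id", "title"]
        self._cache = {}

    def get_field(self, name):
        return name


report = audit_public_api(FakeOptions())
print(report)  # {'methods': ['get_field'], 'data': ['app_label', 'fields']}
```

A listing like this only shows what exists; deciding which names belong in a documented, stable API is the design work the project would have to do.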
Reducing coupling in Django components
- Complexity: Hard
Django is currently delivered as a single monolithic download. This is largely for historical reasons; when Django started as a project, Python's packaging infrastructure wasn't especially mature. This situation has improved over the years, and now Python has a rich set of packaging tools. In an ideal world, it would be possible to download "Just the Django template engine", or "Just the Django ORM", or "Just the Django forms layer"; the combined Django download would really be a meta-package install of all the required parts.
There are two technical problems that need to be solved in order to make this happen.
- Implement the packaging definitions to allow for multiple packages.
- Clean up dependencies between components. Despite the best of intentions, there are some interesting dependencies between modules, some of which may need to be clarified or separated.
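Cleaning up dependencies starts with knowing what they are. As a hedged sketch of how such an audit might begin, the function below uses the standard `ast` module to report which top-level `django.<component>` packages a piece of source code imports. The function name is hypothetical, and the approach only sees static, absolute imports; relative imports and imports done dynamically at runtime would need separate handling.

```python
import ast


def django_components_imported(source):
    """Return the set of top-level django.<component> names imported by source."""
    components = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # e.g. "import django.template.loader"
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            if node.module == "django":
                # "from django import forms" names the component in the alias.
                names = ["django." + alias.name for alias in node.names]
            else:
                # e.g. "from django.db import models"
                names = [node.module]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if parts[0] == "django" and len(parts) > 1:
                components.add(parts[1])
    return components
```

Run over every file in one component's directory, a helper like this would give a first picture of which other components it depends on, and therefore what would have to be untangled before it could ship as a standalone package.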
The aim of this project would be to clean up one or more of Django's internal "parts" so that it could be delivered as a standalone package. This may not be something that can be immediately delivered - for example, it may be necessary to move or rename components to enable separate packaging. In this case, the project deliverable would be to document the strategy, and provide whatever initial moves in that direction are possible.
A simpler version of this project would be to enable separate packaging and distribution of Django's contrib apps.
Improving the less popular database backends
- Complexity: Medium
Django supports several database backends, but not equally. The less popular backends -- Oracle in core, as well as open-source backends outside core -- could probably use some love. As an example, Oracle has three major problems:
- The GIS backend is broken (it does not pass Django's own test-suite)
- Python 3 is not supported (this is a problem with the Python driver, requiring C programming)
- The handling of case in database object names is problematic (e.g. #20487)
While these alone would not fill an agenda for a full GSoC project, an interested student could collect enough related issues -- perhaps in more than one backend -- to keep busy for the whole term.
Keep in mind that for working on 3rd-party backends, a committer for the relevant backend will probably need to be involved in mentoring; however, given such involvement, Django will accept such GSoC projects.
See also:
- Trac's list of Oracle issues
- Similar queries for 3rd-party backends should be added here