Ticket #7996: sitemaps.txt

File sitemaps.txt, 11.7 KB (added by issya, 16 years ago)

sitemaps updated documentation

=====================
The sitemap framework
=====================

Django comes with a high-level sitemap-generating framework that makes
creating sitemap_ XML files easy.

.. _sitemap: http://www.sitemaps.org/

Overview
========

A sitemap is an XML file on your Web site that tells search-engine indexers how
frequently your pages change and how "important" certain pages are in relation
to other pages on your site. This information helps search engines index your
site.

The Django sitemap framework automates the creation of this XML file by letting
you express this information in Python code.
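To make the target format concrete, here is a minimal standard-library sketch of the XML such a file contains. It is an illustration only, not how Django renders sitemaps; ``build_sitemap_xml`` and the sample entry are names and values made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap_xml(entries):
    """Serialize a list of dicts into sitemap XML (illustration only)."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for entry in entries:
        url = ET.SubElement(urlset, "url")
        # <loc> is the only required child; the others are optional hints.
        for tag in ("loc", "lastmod", "changefreq", "priority"):
            if tag in entry:
                ET.SubElement(url, tag).text = str(entry[tag])
    return ET.tostring(urlset, encoding="unicode")


example_xml = build_sitemap_xml([
    {"loc": "http://example.com/blog/1/", "lastmod": "2008-08-01",
     "changefreq": "never", "priority": 0.5},
])
```

Each ``<url>`` element carries one page's address plus the optional change-frequency and priority hints described above.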

It works much like Django's `syndication framework`_. To create a sitemap, just
write a ``Sitemap`` class and point to it in your URLconf_.

.. _syndication framework: ../syndication_feeds/
.. _URLconf: ../url_dispatch/

Installation
============

To install the sitemap app, follow these steps:

    1. Add ``'django.contrib.sitemaps'`` to your INSTALLED_APPS_ setting.
    2. Make sure ``'django.template.loaders.app_directories.load_template_source'``
       is in your TEMPLATE_LOADERS_ setting. It's in there by default, so
       you'll only need to change this if you've changed that setting.
    3. Make sure you've installed the `sites framework`_.

(Note: The sitemap application doesn't install any database tables. The only
reason it needs to go into ``INSTALLED_APPS`` is so that the
``load_template_source`` template loader can find the default templates.)

.. _INSTALLED_APPS: ../settings/#installed-apps
.. _TEMPLATE_LOADERS: ../settings/#template-loaders
.. _sites framework: ../sites/

Initialization
==============

To activate sitemap generation on your Django site, add this line to your
URLconf_::

    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})

This tells Django to build a sitemap when a client accesses ``/sitemap.xml``.

The name of the sitemap file is not important, but the location is. Search
engines will only index links in your sitemap for the current URL level and
below. For instance, if ``sitemap.xml`` lives in your root directory, it may
reference any URL in your site. However, if your sitemap lives at
``/content/sitemap.xml``, it may only reference URLs that begin with
``/content/``.
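The location rule above can be sketched as a small check. This is a hedged illustration using only the standard library; ``in_sitemap_scope`` is a hypothetical helper, not part of Django.

```python
import posixpath


def in_sitemap_scope(sitemap_path, url_path):
    """Return True if url_path sits at or below the sitemap's directory."""
    base = posixpath.dirname(sitemap_path)
    # Compare against "/content/" rather than "/content" so that
    # "/contents/..." is not mistaken for a child of "/content/".
    if not base.endswith("/"):
        base += "/"
    return url_path.startswith(base)
```

With this sketch, ``in_sitemap_scope('/sitemap.xml', '/blog/1/')`` is true, while ``in_sitemap_scope('/content/sitemap.xml', '/blog/1/')`` is false.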

The sitemap view takes an extra, required argument: ``{'sitemaps': sitemaps}``.
``sitemaps`` should be a dictionary that maps a short section label (e.g.,
``blog`` or ``news``) to its ``Sitemap`` class (e.g., ``BlogSitemap`` or
``NewsSitemap``). It may also map to an *instance* of a ``Sitemap`` class
(e.g., ``BlogSitemap(some_var)``).

.. _URLconf: ../url_dispatch/

Sitemap classes
===============

A ``Sitemap`` class is a simple Python class that represents a "section" of
entries in your sitemap. For example, one ``Sitemap`` class could represent all
the entries of your weblog, while another could represent all of the events in
your events calendar.

In the simplest case, all these sections get lumped together into one
``sitemap.xml``, but it's also possible to use the framework to generate a
sitemap index that references individual sitemap files, one per section. (See
`Creating a sitemap index`_ below.)

``Sitemap`` classes must subclass ``django.contrib.sitemaps.Sitemap``. They can
live anywhere in your codebase.

A simple example
================

Let's assume you have a blog system, with an ``Entry`` model, and you want your
sitemap to include all the links to your individual blog entries. Here's how
your sitemap class might look::

    from django.contrib.sitemaps import Sitemap
    from mysite.blog.models import Entry

    class BlogSitemap(Sitemap):
        changefreq = "never"
        priority = 0.5

        def items(self):
            return Entry.objects.filter(is_draft=False)

        def lastmod(self, obj):
            return obj.pub_date

Note:

    * ``changefreq`` and ``priority`` are class attributes corresponding to
      ``<changefreq>`` and ``<priority>`` elements, respectively. They can be
      made callable as functions, as ``lastmod`` was in the example.
    * ``items()`` is simply a method that returns a list of objects. The objects
      returned will get passed to any callable methods corresponding to a
      sitemap property (``location``, ``lastmod``, ``changefreq``, and
      ``priority``).
    * ``lastmod`` should return a Python ``datetime`` object.
    * There is no ``location`` method in this example, but you can provide it
      in order to specify the URL for your object. By default, ``location()``
      calls ``get_absolute_url()`` on each object and returns the result.
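The attribute-or-method behaviour described in these notes can be pictured with a toy base class. This is a sketch under stated assumptions, not Django's implementation: ``MiniSitemap``, ``_resolve``, ``get_urls``, ``Page`` and ``PageSitemap`` are all invented for illustration, and the real ``Sitemap`` class may differ in detail.

```python
import datetime


class MiniSitemap:
    """Toy stand-in for a sitemap base class (not Django's implementation)."""
    changefreq = None
    priority = None

    def items(self):
        return []

    def location(self, obj):
        # Default: ask the object for its own URL.
        return obj.get_absolute_url()

    def _resolve(self, name, obj):
        # Look the property up; call it with the object if it is a method,
        # otherwise treat it as one value shared by every object.
        value = getattr(self, name, None)
        return value(obj) if callable(value) else value

    def get_urls(self):
        names = ("location", "lastmod", "changefreq", "priority")
        return [
            {name: self._resolve(name, obj) for name in names}
            for obj in self.items()
        ]


class Page:
    def __init__(self, slug, updated):
        self.slug = slug
        self.updated = updated

    def get_absolute_url(self):
        return "/pages/%s/" % self.slug


class PageSitemap(MiniSitemap):
    changefreq = "weekly"  # plain attribute: the same for every item

    def items(self):
        return [
            Page("about", datetime.datetime(2008, 8, 1)),
            Page("contact", datetime.datetime(2008, 8, 2)),
        ]

    def lastmod(self, obj):  # method: computed per item
        return obj.updated

    def priority(self, obj):  # method replacing the class attribute
        return 1.0 if obj.slug == "about" else 0.5


urls = PageSitemap().get_urls()
```

Here ``changefreq`` stays a shared attribute while ``lastmod`` and ``priority`` are computed per object, and ``location`` falls back to ``get_absolute_url()``.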

Sitemap class reference
=======================

A ``Sitemap`` class can define the following methods/attributes:

``items``
---------

**Required.** A method that returns a list of objects. The framework doesn't
care what *type* of objects they are; all that matters is that these objects
get passed to the ``location()``, ``lastmod()``, ``changefreq()`` and
``priority()`` methods.

``location``
------------

**Optional.** Either a method or attribute.

If it's a method, it should return the absolute URL for a given object as
returned by ``items()``.

If it's an attribute, its value should be a string representing an absolute URL
to use for *every* object returned by ``items()``.

In both cases, "absolute URL" means a URL that doesn't include the protocol or
domain. Examples:

    * Good: ``'/foo/bar/'``
    * Bad: ``'example.com/foo/bar/'``
    * Bad: ``'http://example.com/foo/bar/'``
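A rough way to check a candidate ``location`` value against these examples is sketched below. It assumes a valid value is simply a path that starts with ``/`` and has no scheme or host; ``is_valid_location`` is a hypothetical helper, not something the framework provides.

```python
from urllib.parse import urlsplit


def is_valid_location(url):
    """Return True for a site-relative path such as '/foo/bar/'."""
    parts = urlsplit(url)
    # Reject anything with a protocol ("http://...") or a host
    # ("//example.com/..."), and anything that is not rooted at "/".
    return not parts.scheme and not parts.netloc and url.startswith("/")
```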

If ``location`` isn't provided, the framework will call the
``get_absolute_url()`` method on each object as returned by ``items()``.

``lastmod``
-----------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's last-modified date/time, as a Python
``datetime.datetime`` object.
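On the output side, the sitemaps.org protocol expects ``<lastmod>`` in W3C Datetime format, whose simplest form is ``YYYY-MM-DD``. The sketch below shows one way a ``datetime`` could be rendered in that form; ``format_lastmod`` is a hypothetical helper and says nothing about how Django formats the value internally.

```python
import datetime


def format_lastmod(value):
    """Render a date or datetime as a W3C-style YYYY-MM-DD string."""
    return value.strftime("%Y-%m-%d")


example_lastmod = format_lastmod(datetime.datetime(2008, 8, 1, 12, 30))
```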

If it's an attribute, its value should be a Python ``datetime.datetime`` object
representing the last-modified date/time for *every* object returned by
``items()``.

``changefreq``
--------------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's change frequency, as a Python string.

If it's an attribute, its value should be a string representing the change
frequency of *every* object returned by ``items()``.

Possible values for ``changefreq``, whether you use a method or attribute, are:

    * ``'always'``
    * ``'hourly'``
    * ``'daily'``
    * ``'weekly'``
    * ``'monthly'``
    * ``'yearly'``
    * ``'never'``
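If you compute ``changefreq`` dynamically, a small guard can catch typos before they reach the generated XML. This is a sketch only: ``VALID_CHANGEFREQS`` and ``check_changefreq`` are hypothetical names, and whether the framework itself rejects other strings is not stated here.

```python
# The seven values listed above, as defined by the sitemaps.org protocol.
VALID_CHANGEFREQS = (
    "always", "hourly", "daily", "weekly", "monthly", "yearly", "never",
)


def check_changefreq(value):
    """Return value unchanged if it is allowed, else raise ValueError."""
    if value not in VALID_CHANGEFREQS:
        raise ValueError("invalid changefreq: %r" % (value,))
    return value
```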

``priority``
------------

**Optional.** Either a method or attribute.

If it's a method, it should take one argument -- an object as returned by
``items()`` -- and return that object's priority, as either a string or float.

If it's an attribute, its value should be either a string or float representing
the priority of *every* object returned by ``items()``.
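Because ``priority`` may be given as a string or a float, it can help to normalize it in one place. The sketch below assumes the range defined by the sitemaps.org protocol, 0.0 to 1.0 inclusive; ``normalize_priority`` is a hypothetical helper, not part of the framework.

```python
def normalize_priority(value):
    """Accept a string or float and return the priority as a string."""
    number = float(value)
    if not 0.0 <= number <= 1.0:
        raise ValueError("priority must be between 0.0 and 1.0: %r" % (value,))
    return str(number)
```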

Example values for ``priority``: ``0.4``, ``1.0``. The default priority of a
page is ``0.5``. See the `sitemaps.org documentation`_ for more.

.. _sitemaps.org documentation: http://www.sitemaps.org/protocol.html#prioritydef

Shortcuts
=========

The sitemap framework provides a couple of convenience classes for common cases:

``FlatPageSitemap``
-------------------

The ``django.contrib.sitemaps.FlatPageSitemap`` class looks at all flatpages_
defined for the current ``SITE_ID`` (see the `sites documentation`_) and
creates an entry in the sitemap. These entries include only the ``location``
attribute -- not ``lastmod``, ``changefreq`` or ``priority``.

.. _flatpages: ../flatpages/
.. _sites documentation: ../sites/

``GenericSitemap``
------------------

The ``GenericSitemap`` class works with any `generic views`_ you already have.
To use it, create an instance, passing in the same ``info_dict`` you pass to
the generic views. The only requirement is that the dictionary have a
``queryset`` entry. It may also have a ``date_field`` entry that specifies a
date field for objects retrieved from the ``queryset``. This will be used for
the ``lastmod`` attribute in the generated sitemap. You may also pass
``priority`` and ``changefreq`` keyword arguments to the ``GenericSitemap``
constructor to specify these attributes for all URLs.

.. _generic views: ../generic_views/

Example
-------

Here's an example of a URLconf_ using both::

    from django.conf.urls.defaults import *
    from django.contrib.sitemaps import FlatPageSitemap, GenericSitemap
    from mysite.blog.models import Entry

    info_dict = {
        'queryset': Entry.objects.all(),
        'date_field': 'pub_date',
    }

    sitemaps = {
        'flatpages': FlatPageSitemap,
        'blog': GenericSitemap(info_dict, priority=0.6),
    }

    urlpatterns = patterns('',
        # some generic view using info_dict
        # ...

        # the sitemap
        (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps})
    )

.. _URLconf: ../url_dispatch/

Creating a sitemap index
========================

The sitemap framework also has the ability to create a sitemap index that
references individual sitemap files, one per section defined in your
``sitemaps`` dictionary. The only differences in usage are:

    * You use two views in your URLconf: ``django.contrib.sitemaps.views.index``
      and ``django.contrib.sitemaps.views.sitemap``.
    * The ``django.contrib.sitemaps.views.sitemap`` view should take a
      ``section`` keyword argument.

Here is what the relevant URLconf lines would look like for the example above::

    (r'^sitemap\.xml$', 'django.contrib.sitemaps.views.index', {'sitemaps': sitemaps}),
    (r'^sitemap-(?P<section>.+)\.xml$', 'django.contrib.sitemaps.views.sitemap', {'sitemaps': sitemaps}),

This will automatically generate a ``sitemap.xml`` file that references
both ``sitemap-flatpages.xml`` and ``sitemap-blog.xml``. The ``Sitemap``
classes and the ``sitemaps`` dict don't change at all.
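The relationship between section labels and per-section file names can be sketched with the standard library alone. This is an illustration of the naming scheme implied by the URL pattern above, not the framework's own code; ``index_entries`` and ``section_for`` are hypothetical helpers.

```python
import re

# Same shape as the per-section URL pattern, with the dot escaped.
SECTION_URL = re.compile(r"^sitemap-(?P<section>.+)\.xml$")


def index_entries(sitemaps):
    """List the per-section file names an index would reference."""
    return ["sitemap-%s.xml" % section for section in sorted(sitemaps)]


def section_for(filename):
    """Recover the section label from a per-section file name."""
    match = SECTION_URL.match(filename)
    return match.group("section") if match else None


# Placeholder values stand in for the Sitemap classes or instances.
example_files = index_entries({"flatpages": object, "blog": object})
```

The keys of the ``sitemaps`` dictionary are what end up as the ``section`` keyword argument when a per-section URL is requested.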

If one of your sitemaps is going to have more than 50,000 URLs, you should
create an index file. Your sitemap will be paginated and the index will
reflect that.

Pinging Google
==============

After you have initially submitted your sitemap to Google's Webmaster Tools,
you may want to "ping" Google when your sitemap changes. This will tell them
to reindex your site. The framework provides a function to do just that:
``django.contrib.sitemaps.ping_google()``.

``ping_google()`` takes an optional argument, ``sitemap_url``, which should be
the absolute URL of your site's sitemap (e.g., ``'/sitemap.xml'``). If this
argument isn't provided, ``ping_google()`` will attempt to figure out your
sitemap by performing a reverse lookup in your URLconf.

``ping_google()`` raises the exception
``django.contrib.sitemaps.SitemapNotFound`` if it cannot determine your sitemap
URL.

One useful way to call ``ping_google()`` is from a model's ``save()`` method::

    from django.contrib.sitemaps import ping_google
    from django.db import models

    class Entry(models.Model):
        # ...
        def save(self, *args, **kwargs):
            # Forward any arguments so callers of save() keep working.
            super(Entry, self).save(*args, **kwargs)
            try:
                ping_google()
            except Exception:
                # Catch any exception because we could get a variety
                # of HTTP-related exceptions.
                pass

A more efficient solution, however, would be to call ``ping_google()`` from a
cron script, or some other scheduled task. The function makes an HTTP request
to Google's servers, so you may not want to introduce that network overhead
each time you call ``save()``.

Pinging Google via ``manage.py``
--------------------------------

**New in Django development version**

Once the sitemaps application is added to your project, you may also
ping Google's servers through the command-line ``manage.py`` interface::

    python manage.py ping_google [/sitemap.xml]