Diffstat (limited to 'docs/using_as_module.txt')
-rw-r--r-- | docs/using_as_module.txt | 150 |
1 files changed, 150 insertions, 0 deletions
diff --git a/docs/using_as_module.txt b/docs/using_as_module.txt
new file mode 100644
index 0000000..130d0a7
--- /dev/null
+++ b/docs/using_as_module.txt
@@ -0,0 +1,150 @@
+Using Markdown as a Python Library
+==================================
+
+First and foremost, Python-Markdown is intended to be a Python library module
+used by various projects to convert Markdown syntax into HTML.
+
+The Basics
+----------
+
+To use markdown as a module:
+
+    import markdown
+    html = markdown.markdown(your_text_string)
+
+Encoded Text
+------------
+
+Note that ``markdown()`` expects **Unicode** as input (although a plain ASCII
+string should work) and returns Unicode output. Do not pass it encoded strings!
+If your input is encoded, e.g. as UTF-8, you are responsible for decoding it:
+
+    import codecs
+    input_file = codecs.open("some_file.txt", mode="r", encoding="utf-8")
+    text = input_file.read()
+    html = markdown.markdown(text)
+
+If you later want to write it to disk, you should encode it yourself:
+
+    output_file = codecs.open("some_file.html", "w", encoding="utf-8")
+    output_file.write(html)
+
+More Options
+------------
+
+If you want to pass more options, you can create an instance of the ``Markdown``
+class yourself and then use ``convert()`` to generate HTML:
+
+    import markdown
+    md = markdown.Markdown(
+        extensions=['footnotes'],
+        extension_configs={'footnotes': ('PLACE_MARKER', '~~~~~~~~')},
+        safe_mode=True,
+        output_format='html4'
+    )
+    html = md.convert(some_text)
+
+You should also use this method if you want to process multiple strings:
+
+    md = markdown.Markdown()
+    html1 = md.convert(text1)
+    html2 = md.convert(text2)
+
+Working with Files
+------------------
+
+While the Markdown class is only intended to work with Unicode text, some
+encoding/decoding is required for the command line features. These functions
+and methods are only intended to fit the common use case.
+
+The ``Markdown`` class has the method ``convertFile`` which reads in a file and
+writes out to a file or file-like object:
+
+    md = markdown.Markdown()
+    md.convertFile(input="in.txt", output="out.html", encoding="utf-8")
+
+The markdown module also includes a shortcut function ``markdownFromFile`` that
+wraps the above method:
+
+    markdown.markdownFromFile(input="in.txt",
+                              output="out.html",
+                              extensions=[],
+                              encoding="utf-8",
+                              safe=False)
+
+In either case, if the ``output`` keyword is passed a file name (e.g.
+``output="out.html"``), it will try to write to a file by that name. If
+``output`` is passed a file-like object (e.g. ``output=StringIO.StringIO()``),
+it will attempt to write out to that object. Finally, if ``output`` is
+set to ``None``, it will write to ``stdout``.
+
+Using Extensions
+----------------
+
+One of the parameters that you can pass is a list of Extensions. Extensions
+must be available as Python modules either within the ``markdown.extensions``
+package or on your PYTHONPATH with names starting with ``mdx_``, followed by the
+name of the extension. Thus, ``extensions=['footnotes']`` will first look for
+the module ``markdown.extensions.footnotes``, then a module named
+``mdx_footnotes``. See the documentation specific to the extension you are
+using for help in specifying configuration settings for that extension.
+
+Note that some extensions may need their state reset between each call to
+``convert``:
+
+    html1 = md.convert(text1)
+    md.reset()
+    html2 = md.convert(text2)
+
+Safe Mode
+---------
+
+If you are using Markdown on a web system which will transform text provided
+by untrusted users, you may want to use the "safe_mode" option which ensures
+that the user's HTML tags are either replaced, removed or escaped. (They can
+still create links using Markdown syntax.)
+
+* To replace HTML, set ``safe_mode="replace"`` (``safe_mode=True`` still works
+  for backward compatibility with older versions).
+  The HTML will be replaced with the text defined in ``markdown.HTML_REMOVED_TEXT``,
+  which defaults to ``[HTML_REMOVED]``. To replace the HTML with something else:
+
+      markdown.HTML_REMOVED_TEXT = "--RAW HTML IS NOT ALLOWED--"
+      md = markdown.Markdown(safe_mode="replace")
+
+  **Note**: You could edit the value of ``HTML_REMOVED_TEXT`` directly in
+  markdown/__init__.py but you will need to remember to do so every time you
+  upgrade to a newer version of Markdown. Therefore, this is not recommended.
+
+* To remove HTML, set ``safe_mode="remove"``. Any raw HTML will be completely
+  stripped from the text with no warning to the author.
+
+* To escape HTML, set ``safe_mode="escape"``. The HTML will be escaped and
+  included in the document.
+
+Output Formats
+--------------
+
+If Markdown is outputting (X)HTML as part of a web page, most likely you will
+want the output to match the (X)HTML version used by the rest of your page/site.
+Currently, Markdown offers two output formats out of the box: "HTML4" and
+"XHTML1" (the default). Markdown will also accept the formats "HTML" and
+"XHTML", which currently map to "HTML4" and "XHTML1" respectively. However,
+you should use the more explicit keys as the general keys may change in the
+future if it makes sense at that time. The keys can either be lowercase or
+uppercase.
+
+To set the output format do:
+
+    html = markdown.markdown(text, output_format='html4')
+
+Or, when using the Markdown class:
+
+    md = markdown.Markdown(output_format='html4')
+    html = md.convert(text)
+
+Note that the output format is only set once for the class and cannot be
+specified each time ``convert()`` is called. If you really must change the
+output format for the class, you can use the ``set_output_format`` method:
+
+    md.set_output_format('xhtml1')
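
Note on the "Output Formats" section of the added file: the key handling it describes (case-insensitive keys, with the general keys "HTML" and "XHTML" currently resolving to explicit ones) can be pictured with the small sketch below. This is only an illustration of the documented behaviour, not Python-Markdown's implementation; the names `ALIASES`, `EXPLICIT_FORMATS` and `normalize_output_format` are hypothetical, and the alias targets reflect the current mapping, which the text says may change.

```python
# Hypothetical sketch of the key handling described under "Output Formats".
# These names are illustrative only and are not Python-Markdown internals.

# General keys and the explicit formats they currently resolve to.
ALIASES = {"html": "html4", "xhtml": "xhtml1"}

# The two explicit formats the document says are offered out of the box.
EXPLICIT_FORMATS = ("html4", "xhtml1")


def normalize_output_format(key):
    """Return the explicit format name for a case-insensitive key."""
    name = key.lower()               # keys may be lowercase or uppercase
    name = ALIASES.get(name, name)   # resolve a general key to an explicit one
    if name not in EXPLICIT_FORMATS:
        raise ValueError("Unknown output format: %r" % (key,))
    return name


if __name__ == "__main__":
    for key in ("HTML4", "xhtml1", "html", "XHTML"):
        print(key, "->", normalize_output_format(key))
```

Passing an explicit key such as `'html4'` or `'xhtml1'` avoids depending on the alias table at all, which is the reason the document recommends the explicit keys.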