We are happy to receive submissions of material to include in the library. First, make sure the material matches the following criteria. If so, contact us.
Material must be in the public domain. This means it must have been specifically placed into the public domain by the copyright holder, or it must have had an original copyright no later than 1928 in the United States. WE WILL NOT ACCEPT MATERIAL THAT IS UNDER COPYRIGHT. Note the following exception.
Exception: if you are the copyright owner and wish to include your material here, you must have released it to the public domain or else you must agree that it can be used for any purpose so long as it remains unmodified and the copyright is included with it. We will not include material with more complicated restrictions than this.
Material should match the following formatting rules. We might accept material that does not follow these rules (our decision), but it will not appear in the library until we have formatted it, and this can take a long time.
Material must be primarily in English (some original Greek, Hebrew, or Latin might be accepted).
You must indicate the provenance; that is, where you obtained the material (web site, book/magazine name if you scanned it, etc.).
Material must be largely unmodified from the original, other than changes required by the formatting rules below and the Text Markup as defined here.
Material should not be overly sectarian or "fringe."
Material must be Biblically relevant and adhere to established Protestant hermeneutical principles.
Material should be provided as HTML, PDF, plain text, Open Office, or Word97 for text; BMP, JPEG, or PNG for images; MIDI, MP3, or WAV for audio. Multiple files can be placed into a ZIP archive.
The Erasmus Project makes the final determination as to what is included in the library.
Formatting:
The formatting of erasmus-project text follows some simple HTML layout, with a few modifications. This is detailed in the Source File Layout,
but the purpose is to make the text easy to procedurally transform for use in software. The source file layout document also includes
guidelines for when various tags and directives ought to be used.
Rules for formatting
The overall goal is to be as true as reasonable to the original published version of the work in question. It is the policy of the Erasmus Project not to modify or adjust the meaning, interpretation, or offensiveness of any of the material - even for subtle adjustments. However, it is understood that electronic display of material is often quite different from the printed page, and some changes in layout - but not content - are allowed and encouraged. Specific rules follow:
Paragraphing. As much as possible, the original paragraphing is to be left intact. Sometimes this may not be obvious from the sources, in which case best guesses are made. However, if it seems equally valid to place a given paragraph break in one of two places, opt for the one which reduces the size of the paragraphs, even though that will create additional paragraphs. If a paragraph terminates with a colon, semicolon, comma, or dash, use a <br> tag to include the following text in the same physical paragraph even though it may appear, visually, as multiple paragraphs.
Of course, if the terminating punctuation is clearly a print, copy, or scan error, it can be replaced with the proper character.
Paragraphs are indicated by one or more of the following source conventions: 1) a <p>, </p> tag pair, 2) a blank line between paragraphs, and/or 3) use of a directive line.
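The <br> rule for paragraphs that end in a colon, semicolon, comma, or dash can be sketched in Python roughly as follows. This is only an illustration and not part of any Erasmus Project tooling: the function name join_continued_paragraphs and the list-of-strings input are assumptions made for the example, and an editor still has to rule out terminating punctuation that is really a print, copy, or scan error.

```python
# Illustrative sketch only; the function name and input format are assumptions.
# Endings that mean the next block belongs to the same physical paragraph.
CONTINUATION_ENDINGS = (":", ";", ",", "-", "\u2013", "\u2014")


def join_continued_paragraphs(paragraphs):
    """Merge a paragraph into the previous one with <br> when the previous
    one ends with a colon, semicolon, comma, or dash."""
    merged = []
    for para in paragraphs:
        text = para.strip()
        if not text:
            continue  # ignore empty entries
        if merged and merged[-1].endswith(CONTINUATION_ENDINGS):
            merged[-1] = merged[-1] + "<br>" + text
        else:
            merged.append(text)
    return merged
```

For example, the three blocks "He said:", "Go.", and "Next." would become two paragraphs, the first being "He said:<br>Go.".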
Spelling corrections. Obvious misspellings and typos can be corrected, for instance, "preist" appearing instead of "priest". But alternate spellings (e.g. "color" vs "colour") should be left unchanged. No archaic language or spelling may be "corrected". Nor should corrections be made for anything that could indicate a stylistic or interpretive choice - for instance, the use of "he" or "He" in reference to God is left as-is. However, all sentences should begin with capital letters unless there is a very good reason not to do so.
Unicode. Use of ASCII codes above 127 is not allowed. If necessary, use UTF-8 formatting. Only 7-bit ASCII, UTF-8, and Biblos Greek formatting are allowed.
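A submission can be checked against this rule before it is sent. The sketch below is a minimal example, assuming the file is read as raw bytes; the function name is_allowed_encoding is made up for illustration. Because 7-bit ASCII is a subset of UTF-8, a single UTF-8 decode covers both cases. It does not attempt to recognize Biblos Greek formatting, which would need its own check.

```python
# Illustrative sketch only; the function name is an assumption.
def is_allowed_encoding(data: bytes) -> bool:
    """Return True if the bytes are 7-bit ASCII or well-formed UTF-8."""
    try:
        data.decode("utf-8")  # ASCII is a subset of UTF-8
    except UnicodeDecodeError:
        return False
    return True
```

A file saved in a single-byte encoding such as Latin-1 that contains accented characters would fail this check, which is the situation the rule is meant to catch.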
Punctuation. Generally, this is left alone. However, it is allowable for opening/closing apostrophes and quotes to be converted to their 7-bit ASCII equivalents. Likewise, en dashes, em dashes, and double dashes (--) can optionally be converted to a normal dash (-), although it is recommended that a space-dash-space be used if this is done.
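The optional punctuation conversions could look something like the following Python sketch. The function name normalize_punctuation is an assumption for illustration, and the set of characters handled is only the common curly quotes and dashes; a real source may contain others. The dash replacement uses the recommended space-dash-space form.

```python
import re

# Illustrative sketch only; names and the character set covered are assumptions.
# Curly quotes and apostrophes mapped to their 7-bit ASCII equivalents.
QUOTE_MAP = str.maketrans({
    "\u2018": "'", "\u2019": "'",
    "\u201c": '"', "\u201d": '"',
})

# En dash, em dash, or double hyphen, plus any spaces or tabs around it.
DASH_RE = re.compile("[ \t]*(?:\u2013|\u2014|--)[ \t]*")


def normalize_punctuation(text):
    """Convert curly quotes to ASCII and dashes to space-dash-space."""
    return DASH_RE.sub(" - ", text.translate(QUOTE_MAP))
```

Existing spaces around a dash are absorbed so that the result never has doubled spaces, for example "a -- b" becomes "a - b".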
Consistency.
Minor inconsistencies in formatting may (and should) be corrected. For instance, if a numbered list is numbered inconsistently, such as "1)", "(2)", "(3.)" in the same list, the markers can be normalized to just one of those forms.
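As a rough illustration of normalizing list markers, the Python sketch below rewrites markers such as "1)", "(2)", and "(3.)" at the start of a line to the single form "N)". The function name normalize_list_markers and the choice of "N)" as the target form are assumptions; any one of the forms would do as long as it is applied consistently. It should only be run on lines already known to be list items.

```python
import re

# Illustrative sketch only; the name and the target form "N)" are assumptions.
# Matches "1)", "(2)", "(3.)", or "4." at the start of a line, followed by whitespace.
MARKER_RE = re.compile(r"^(\s*)\(?(\d+)(?:\.\)|\)|\.)\s+")


def normalize_list_markers(lines):
    """Rewrite leading numeric list markers to the single form 'N) '."""
    return [MARKER_RE.sub(r"\g<1>\g<2>) ", line, count=1) for line in lines]
```

A line that merely starts with a number and has no marker punctuation, such as "1928 was a year", is left unchanged.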
Words of Christ. There are some minor disagreements about what Bible text to include as words of Jesus for purposes of highlighting/colorization. No Bible text should include specific coloration for text. For words of Christ, the <woc>, </woc> tag pairs should be used to delimit the text. The software using such text can decide whether or not to use it, and what coloration and mechanism is appropriate.
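To show why the markup leaves the decision to the software, here is a small Python sketch of how a hypothetical reader application might handle <woc> tags. The function name render_woc and the "woc" CSS class are assumptions made for this example and are not defined by the Erasmus Project; an application could equally choose a different mechanism.

```python
import re

# Illustrative sketch only; render_woc and the "woc" class name are assumptions.
WOC_RE = re.compile(r"<woc>(.*?)</woc>", re.DOTALL)


def render_woc(text, highlight=True):
    """Either wrap words of Christ in a styled span or drop the tags entirely."""
    if highlight:
        return WOC_RE.sub(r'<span class="woc">\1</span>', text)
    return WOC_RE.sub(r"\1", text)
```

With highlight=False the source text is displayed with no special treatment, so the same source file serves readers who want highlighting and those who do not.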
Print vs. Electronic Changes.
Some book formatting is specifically related to the fact that the text is in a printed form. If there is a better layout for electronic display, this should be used. Following are some examples:
Multiple columns (outside of tables) are a way to save space on the printed page but are undesirable in an electronic medium where the width of the display can vary considerably. All multiple column layouts should be converted to a single column.
Indexes and Tables of Contents are not to be included as they are specific to page numbers, which have no relation to a "page" of text on a screen, especially when multiple printings of a book may have inconsistent paging between them.
Tables often repeat headers at the top of each print page, but this is not to be included in erasmus project sources. However, table headers MUST use the
tag pairs so that the electronic software using the source can - at its option - keep a heading in view (or repeat it as desired) as the user scrolls through the table.
Use of strings of dots or dashes to help match up table text in different columns should be avoided. Either use table borders, or avoid it altogether and allow the software using the text to help the user match text across columns.
Although page numbers are not to be relied on, page numbers ought to be included in the source using the <page> tag, if they are available. This tag should appear in the text that occurs at the top of the print page that the tag indicates.
Coloration of text/background ought to be avoided unless there is a very good reason to include it. Such a reason would not include different colors for different levels of text, but might be valid in a version of the Bible that colors words to indicate specific attributes of those words.
Footnote indicators. A work may include multiple sets of footnotes, perhaps with certain types of notes indicated by letters and other types of notes indicated by numbers. These sets should be identified and maintained. However, it is often the case that the same indicators are reused on different pages (for instance, the first footnote indicator on each page being "a"). Because a screen can show text from several print pages at once, the same footnote indicator could be visible multiple times. It is therefore best not to assign specific footnote indicators to each footnote, but rather to use the #FOOTNOTE_SET directive to define the footnote sets, and then use <footnote> tags to which the software can assign the next indicator in a sequence, for instance going from "a" to "z" before going back to "a". The specific indicator should therefore not be included in the text, but left up to the software.
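The cycling behaviour described above (running from "a" to "z" and then starting again at "a") can be sketched in Python as follows. This is only a guess at how consuming software might assign indicators; the function name footnote_indicators is an assumption, and the exact syntax of the #FOOTNOTE_SET directive and <footnote> tag is defined in the source file layout document, not here.

```python
import string
from itertools import cycle, islice

# Illustrative sketch only; the function name is an assumption.
def footnote_indicators(count, symbols=string.ascii_lowercase):
    """Return `count` indicators, wrapping around when the symbols run out."""
    return list(islice(cycle(symbols), count))
```

A footnote set that uses other symbols could pass its own sequence as the symbols argument.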
Images.
Images should always include a title indicator with the #IMAGE directive, and shedding information if appropriate. Images that are nested within text should use the <div style="float:"> mechanism to float text around the left or right of the images. However, this mechanism should not be used for interlinears. Rather, use the <block>, </block> tag pair to delimit blocks in interlinears.
Other considerations.
Some material may come from sources where textual "tricks" are used to indicate emphasis, for instance words set in all-caps. This is usually done in older plain text electronic media. To make the text easier to read and to avoid possible confusion with acronyms, if (and only if) it is clear that the all-caps is used for emphasis, the text should be converted to lowercase and made both bold and italic so that it is clearly highlighted regardless of the font being used to display it.
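Since deciding whether all-caps text is emphasis or an acronym takes human judgment, any automation should work from a list the editor supplies. The Python sketch below assumes such a list and assumes that the standard HTML <b> and <i> tags are the intended way to mark bold and italic; both the function name convert_caps_emphasis and that tag choice are assumptions for illustration.

```python
import re

# Illustrative sketch only; the name and the use of <b><i> are assumptions.
def convert_caps_emphasis(text, emphasized_words):
    """Lowercase each editor-approved all-caps word and wrap it in bold italic."""
    for word in emphasized_words:
        pattern = re.compile(r"\b" + re.escape(word) + r"\b")
        replacement = "<b><i>" + word.lower() + "</i></b>"
        # A function is used so the replacement is inserted literally.
        text = pattern.sub(lambda match: replacement, text)
    return text
```

Words not in the supplied list, such as genuine acronyms, are left untouched.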
When to include outline directives: #OUTLINE should be used in the text wherever a corresponding entry exists in a table of contents. For very large sections under a single outline, outline sublevels can be added where there are headings in the text and where it makes sense. For Bibles and other BCV-oriented material without tables of contents, Old Testament and New Testament outline levels should always be included, and Bible books should each have a sub-level under those. Further subdivisions of Bibles using the #OUTLINE directive should be avoided to keep outlines a manageable length. For non-Bibles without tables of contents, outlines should follow the headings used in the text, except where those headings would result in a very deep nesting or a very long outline. For material that is lengthy and without a table of contents or included headings, outline levels can be added where it makes sense - but the text of the outline should not display any bias of interpretation.