Documentation ¶
Index ¶
- Variables
- func Collect(min, max int, input, extension string, level int) (collectedLineCount, filteredLineCount int)
- func FilterLanguage(languages []string)
- func GetMD5Hash(bytes []byte) string
- func NormalizeHtmlEntities()
- func Sentesize(min, max int)
- func Split(use_udpipe, output_lemmas bool)
- func StripHtml()
- func Unique(DEBUG bool)
- type CSS
- type HTML
- type HTMLAttr
- type JS
- type JSStr
- type URL
Constants ¶
This section is empty.
Variables ¶
var PARSER *udpipe.Parser
Functions ¶
func FilterLanguage ¶
func FilterLanguage(languages []string)
func GetMD5Hash ¶
func GetMD5Hash(bytes []byte) string
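The index lists this function as `func GetMD5Hash(bytes []byte) string`, but the page does not describe what the returned string contains. The sketch below assumes it is the lowercase hexadecimal MD5 digest of the input; `md5Hex` is a hypothetical stand-in, not the package's own implementation.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// md5Hex is a hypothetical stand-in for GetMD5Hash. It assumes the
// function returns the lowercase hex encoding of the MD5 digest.
func md5Hex(data []byte) string {
	sum := md5.Sum(data) // sum is a [16]byte array
	return hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(md5Hex([]byte("hello"))) // 5d41402abc4b2a76b9719d911017c592
}
```

MD5 is not collision resistant, so a digest like this is suitable for fingerprinting or deduplicating content but not for security-sensitive checks.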
func NormalizeHtmlEntities ¶
func NormalizeHtmlEntities()
Types ¶
type CSS ¶
type CSS string
CSS encapsulates known safe content that matches any of:
- The CSS3 stylesheet production, such as `p { color: purple }`.
- The CSS3 rule production, such as `a[href=~"https:"].foo#bar`.
- CSS3 declaration productions, such as `color: red; margin: 2px`.
- The CSS3 value production, such as `rgba(0, 0, 255, 127)`.
See http://www.w3.org/TR/css3-syntax/#parsing and https://web.archive.org/web/20090211114933/http://w3.org/TR/css3-syntax#style
Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.
type HTML ¶
type HTML string
HTML encapsulates a known safe HTML document fragment. It should not be used for HTML from a third party, or HTML with unclosed tags or comments. The outputs of a sound HTML sanitizer and a template escaped by this package are fine for use with HTML.
Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.
type HTMLAttr ¶
type HTMLAttr string
HTMLAttr encapsulates an HTML attribute from a trusted source, for example, ` dir="ltr"`.
Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.
type JS ¶
type JS string
JS encapsulates a known safe EcmaScript5 Expression, for example, `(x + y * z())`. Template authors are responsible for ensuring that typed expressions do not break the intended precedence and that there is no statement/expression ambiguity as when passing an expression like "{ foo: bar() }\n['foo']()", which is both a valid Expression and a valid Program with a very different meaning.
Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.
Using JS to include valid but untrusted JSON is not safe. A safe alternative is to parse the JSON with json.Unmarshal and then pass the resultant object into the template, where it will be converted to sanitized JSON when presented in a JavaScript context.
type JSStr ¶
type JSStr string
JSStr encapsulates a sequence of characters meant to be embedded between quotes in a JavaScript expression. The string must match a series of StringCharacters:
    StringCharacter :: SourceCharacter but not `\` or LineTerminator
                     | EscapeSequence
Note that LineContinuations are not allowed. JSStr("foo\\nbar") is fine, but JSStr("foo\\\nbar") is not.
Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.
type URL ¶
type URL string
URL encapsulates a known safe URL or URL substring (see RFC 3986). A URL like `javascript:checkThatFormNotEditedBeforeLeavingPage()` from a trusted source should go in the page, but by default dynamic `javascript:` URLs are filtered out since they are a frequently exploited injection vector.
Use of this type presents a security risk: the encapsulated content should come from a trusted source, as it will be included verbatim in the template output.