SoFunction
Updated on 2025-04-13

A brief analysis of the Prototype source code: the String part (3), HTML string processing

HTML processing: stripTags | escapeHTML | unescapeHTML
JSON processing: unfilterJSON | isJSON | evalJSON | parseJSON
Script processing: stripScripts | extractScripts | evalScripts
This installment of the String part turns to specific applications. The methods fall into three groups, which handle HTML strings, JSON strings, and script strings embedded in HTML respectively.
[As an aside, /TomXu/archive/2012/01/11/ is worth a look.]
Each group is described below:
1. HTML string
stripTags : Removes all HTML tags from the string.
escapeHTML: Converts HTML special characters to their equivalent entities. (& becomes &amp;, < becomes &lt;, > becomes &gt;)
unescapeHTML: Removes tags from strings and converts HTML special characters represented by entities into their normal form. (Inverse operation of escapeHTML)
stripTags uses the regular expression /<\w+(\s+("[^"]*"|'[^']*'|[^>])+)?>|<\/\w+>/gi to match tags. Take care not to break the regex literal across lines; a line break inside it is a syntax error.
[The one thing to watch for in this method: stripTags removes the <script> tags themselves but not the content between them, so the script source may be left exposed in the output and affect the page structure.]
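The three HTML methods can be sketched as standalone functions. This is a minimal illustration, not Prototype's actual definitions: Prototype attaches them to String.prototype, and the function signatures and the variable names plain, leaked, escaped and restored are assumptions made for this example. It also shows the script caveat noted above.

```javascript
// Standalone sketches; Prototype defines these as String.prototype methods.
function stripTags(str) {
  // Remove opening tags (with optional attributes) and closing tags.
  return str.replace(/<\w+(\s+("[^"]*"|'[^']*'|[^>])+)?>|<\/\w+>/gi, '');
}

function escapeHTML(str) {
  // Replace & first so the entities produced afterwards are not escaped again.
  return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

function unescapeHTML(str) {
  // Strip tags, then decode; &amp; is decoded last so "&amp;lt;" yields "&lt;".
  return stripTags(str)
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&');
}

var plain = stripTags('<p class="a">Hello <b>world</b></p>'); // 'Hello world'
var leaked = stripTags('<script>alert(1)</script>done');      // 'alert(1)done'
var escaped = escapeHTML('a < b && c > d');
var restored = unescapeHTML(escaped);                         // 'a < b && c > d'
```

The leaked value shows that only the tags are removed; the script source stays in the string.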
2. Script string
stripScripts: Removes all HTML script blocks in the string. This makes up for the way stripTags handles script tags.
extractScripts: Extracts the contents of all script blocks in the string and returns them as an array of strings.
evalScripts: Executes the contents of all script blocks in the string. Returns an array containing the value returned by each script.
The regular expression in stripScripts is a development of the one in stripTags.
The code is as follows:

function stripScripts() {
  // i: ignore case, m: multiline, g: global
  var pattern = new RegExp('<script[^>]*>([\\S\\s]*?)<\/script>', 'img');
  // Replace every script block, tags and content, with an empty string
  return this.replace(pattern, '');
}

The code is as follows:

function extractScripts() {
  var matchAll = new RegExp('<script[^>]*>([\\S\\s]*?)<\/script>', 'img'),
      matchOne = new RegExp('<script[^>]*>([\\S\\s]*?)<\/script>', 'im');
  // matchAll finds every script block; matchOne captures the content of one block
  return (this.match(matchAll) || []).map(function(scriptTag) {
    return (scriptTag.match(matchOne) || ['', ''])[1];
  });
}

map is an extension to arrays. Some browsers provide this method natively; see "Array of Chrome Native Methods".
The result is an array holding the contents of every script tag, so the approach for evalScripts follows naturally: loop over that array, execute (eval) each entry in turn, and store the result of each execution.
The code is as follows:

function evalScripts() {
  // Evaluate each extracted script body and collect the results
  return this.extractScripts().map(function(script) { return eval(script); });
}
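To see the three script methods working together, here is a minimal standalone sketch. It assumes plain functions that take the string as an argument instead of methods on String.prototype; the constant name SCRIPT_FRAGMENT and the sample html string are made up for this example.

```javascript
// Shared pattern source, the same one used in the snippets above.
var SCRIPT_FRAGMENT = '<script[^>]*>([\\S\\s]*?)<\/script>';

function stripScripts(str) {
  return str.replace(new RegExp(SCRIPT_FRAGMENT, 'img'), '');
}

function extractScripts(str) {
  var matchAll = new RegExp(SCRIPT_FRAGMENT, 'img'),
      matchOne = new RegExp(SCRIPT_FRAGMENT, 'im');
  return (str.match(matchAll) || []).map(function (scriptTag) {
    return (scriptTag.match(matchOne) || ['', ''])[1];
  });
}

function evalScripts(str) {
  // eval runs arbitrary code; only use it on trusted input.
  return extractScripts(str).map(function (script) { return eval(script); });
}

var html = '<p>hi</p><script>1 + 2</script>' +
           '<SCRIPT type="text/javascript">"a" + "b"</SCRIPT>';

var stripped = stripScripts(html);  // '<p>hi</p>'
var bodies = extractScripts(html);  // ['1 + 2', '"a" + "b"']
var results = evalScripts(html);    // [3, 'ab']
```

Because the i flag is set, the upper-case SCRIPT tag is matched as well.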

3. JSON processing
unfilterJSON: Removes security comment delimiters around Ajax JSON or JavaScript response content.
isJSON: Uses regular expressions to check whether a string is valid JSON.
evalJSON: Evaluates a JSON string and returns the resulting object.
Taken together, isJSON and evalJSON amount to parseJSON, and the code is similar; see "Parsing JSON from Strings".
Incidentally, the security comment delimiter handled by unfilterJSON is a security mechanism. For data from your own server, special characters (delimiters) can be added to both ends of the return value to mark where the data came from. When the client parses the response, unfilterJSON removes the added delimiters. This can mitigate some XSS attacks to a certain extent.
The default form in Prototype is:
'/*-secure-\n{"name": "Xiaoxishanzi","age": 24}\n*/'
where the delimiters are '/*-secure-\n' and '\n*/'.
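The JSON flow described above can be sketched as follows. This is a hedged illustration, not Prototype's source: the JSON_FILTER pattern is an assumption modelled on the delimiters shown, the functions take the string as an argument, and isJSON here swaps in a try/catch around the native JSON.parse in place of a regular-expression check.

```javascript
// Assumed filter pattern, modelled on the '/*-secure-' ... '*/' delimiters.
var JSON_FILTER = /^\/\*-secure-([\s\S]*)\*\/\s*$/;

function unfilterJSON(str, filter) {
  // Keep only the captured payload; unwrapped strings pass through unchanged.
  return str.replace(filter || JSON_FILTER, '$1');
}

function isJSON(str) {
  // Swapped-in check: rely on JSON.parse instead of a regex test.
  try {
    JSON.parse(str);
    return true;
  } catch (e) {
    return false;
  }
}

function evalJSON(str) {
  var json = unfilterJSON(str);
  if (!isJSON(json)) {
    throw new SyntaxError('Badly formed JSON string: ' + str);
  }
  return JSON.parse(json);
}

var response = '/*-secure-\n{"name": "Xiaoxishanzi","age": 24}\n*/';
var person = evalJSON(response);
// person.name is 'Xiaoxishanzi' and person.age is 24
```

A response without the delimiters goes through the same path, since unfilterJSON leaves it untouched.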