edu.stanford.nlp.util
Class XMLUtils
java.lang.Object
edu.stanford.nlp.util.XMLUtils
public class XMLUtils
- extends Object
- Author:
- Teg Grenager
Field Summary
static Set
breakingTags
Block-level HTML tags that are rendered with surrounding line breaks.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
breakingTags
public static final Set breakingTags
- Block-level HTML tags that are rendered with surrounding line breaks.
XMLUtils
public XMLUtils()
stripTags
public static String stripTags(Reader r,
List mapBack,
boolean markLineBreaks)
- Parameters:
r
- the reader to read the XML/HTML from
mapBack
- a List of Integers mapping the positions in the result buffer
to positions in the original Reader; it is cleared on receipt
- Returns:
- the String containing the resulting text
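The mapBack contract described above can be illustrated with a small standalone sketch. This is not the XMLUtils implementation: the class StripTagsSketch and its strip method are hypothetical names, the sketch treats anything between '<' and '>' as a tag, and it ignores the markLineBreaks option entirely.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; not the library code.
class StripTagsSketch {

    // Copies every character outside <...> into the result and records,
    // for each copied character, its offset in the original input.
    static String strip(Reader r, List<Integer> mapBack) throws IOException {
        mapBack.clear(); // mirrors "will be cleared on receipt"
        StringBuilder out = new StringBuilder();
        boolean inTag = false;
        int pos = 0; // offset of the current character in the input
        int c;
        while ((c = r.read()) != -1) {
            if (c == '<') {
                inTag = true;
            } else if (c == '>' && inTag) {
                inTag = false;
            } else if (!inTag) {
                out.append((char) c);
                mapBack.add(pos);
            }
            pos++;
        }
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        List<Integer> mapBack = new ArrayList<>();
        String text = strip(new StringReader("<b>hi</b> x"), mapBack);
        System.out.println(text);    // prints "hi x"
        System.out.println(mapBack); // prints "[3, 4, 9, 10]"
    }
}
```

With such a map, the character at index i of the returned text can be traced back to offset mapBack.get(i) in the original input.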
isBreaking
public static boolean isBreaking(String tag)
isBreaking
public static boolean isBreaking(XMLUtils.XMLTag tag)
readUntilTag
public static String readUntilTag(Reader r)
throws IOException
- Reads all text up to the next XML tag and returns it as a String.
- Returns:
- the String of the text read, which may be empty.
- Throws:
IOException
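A minimal sketch of this kind of read loop, written without the library, may make the "may be empty" case concrete. The class name ReadUntilTagSketch is hypothetical, and the sketch assumes the opening '<' is consumed and not returned, which the documentation above does not state explicitly.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for illustration only; not the library code.
class ReadUntilTagSketch {

    // Accumulates characters until '<' or end of input is reached.
    // Assumption: the '<' itself is consumed but not included in the result.
    static String readUntilTag(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = r.read()) != -1 && c != '<') {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(readUntilTag(new StringReader("abc<d>"))); // prints "abc"
        // Input that starts with a tag yields the empty String.
        System.out.println(readUntilTag(new StringReader("<x>")).isEmpty()); // prints "true"
    }
}
```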
readAndParseTag
public static XMLUtils.XMLTag readAndParseTag(Reader r)
throws Exception
- Returns:
- the new XMLTag object, or null if one couldn't be created
- Throws:
Exception
unescapeStringForXML
public static String unescapeStringForXML(String s)
escapeStringForXML
public static String escapeStringForXML(String s)
- Returns a String in which all of the special characters of XML have been escaped. The resulting String
can be used as text in well-formed XML.
- Parameters:
s
- the String to escape
- Returns:
- the escaped String
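For readers unfamiliar with XML escaping, the following standalone sketch shows the general technique using the five predefined XML entities. It is an illustration under assumptions, not the XMLUtils source: the class XmlEscapeSketch is a hypothetical name, and the exact set of characters the library escapes is not listed in this documentation.

```java
// Hypothetical stand-in for illustration only; not the library code.
class XmlEscapeSketch {

    // Replaces the five characters that have predefined XML entities.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  sb.append("&amp;");  break;
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // prints: a &lt; b &amp; &quot;c&quot;
        System.out.println(escape("a < b & \"c\""));
    }
}
```

The ampersand is handled per character in a single pass, so entities produced for other characters are never escaped a second time.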
escapeTextAroundXMLTags
public static String escapeTextAroundXMLTags(String s)
readTag
public static String readTag(Reader r)
throws IOException
- Reads all text of the XML tag and returns it as a String.
Assumes that a '<' character has already been read.
- Parameters:
r
- the Reader to read the tag from
- Returns:
- the String representing the tag, or null if one couldn't be read
- Throws:
IOException
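The precondition that '<' has already been read can be illustrated with a standalone sketch. The class ReadTagSketch is a hypothetical name; the sketch assumes the returned String includes the surrounding '<' and '>' and that null is returned when the input ends before a closing '>', neither of which is confirmed by the documentation above.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Hypothetical stand-in for illustration only; not the library code.
class ReadTagSketch {

    // Assumes the caller has already consumed the opening '<'.
    // Reads through the closing '>' and returns the whole tag,
    // or null if the input ends first (an assumption of this sketch).
    static String readTag(Reader r) throws IOException {
        StringBuilder sb = new StringBuilder("<");
        int c;
        while ((c = r.read()) != -1) {
            sb.append((char) c);
            if (c == '>') {
                return sb.toString();
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Reader r = new StringReader("<p id=\"a\">text");
        r.read(); // consume the '<' to satisfy the precondition
        System.out.println(readTag(r)); // prints "<p id="a">"
    }
}
```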
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
parseTag
public static XMLUtils.XMLTag parseTag(String tagString)
throws Exception
- Throws:
Exception
Stanford NLP Group