Class StringTokenizer
- All Implemented Interfaces:
Enumeration<String>, Iterator<String>
The tokenization method is much simpler than the one used by the
StreamTokenizer class. The StringTokenizer methods
do not distinguish among identifiers, numbers, and quoted strings, nor do
they recognize and skip comments.
The set of delimiters (the characters that separate tokens) may be specified either at creation time or on a per-token basis.
There are two kinds of delimiters: token delimiters and non-token delimiters. A token is either one token delimiter character, or a maximal sequence of consecutive characters that are not delimiters.
A StringTokenizer object internally maintains a current
position within the string to be tokenized. Some operations advance this
current position past the characters processed.
The implementation is not thread safe; if a StringTokenizer
object is intended to be used in multiple threads, an appropriate wrapper
must be provided.
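A minimal sketch of such a wrapper, under stated assumptions: SynchronizedTokens and poll() are hypothetical names that do not exist in this library, and java.util.StringTokenizer stands in for this class so that the example runs on a plain JDK. The point it illustrates is that the "is there a token?" check and the "take the token" call should happen under one lock, so that another thread cannot consume the token in between.

```java
import java.util.Optional;
import java.util.StringTokenizer;

/** Hypothetical wrapper that serializes access to a tokenizer shared between threads. */
class SynchronizedTokens {
    // java.util.StringTokenizer is used here only as a stand-in for this class.
    private final StringTokenizer tokenizer;

    SynchronizedTokens(String text, String delims) {
        this.tokenizer = new StringTokenizer(text, delims);
    }

    /** Checks for a token and takes it as one atomic step; empty when exhausted. */
    synchronized Optional<String> poll() {
        return tokenizer.hasMoreTokens()
                ? Optional.of(tokenizer.nextToken())
                : Optional.empty();
    }

    public static void main(String[] args) {
        SynchronizedTokens tokens = new SynchronizedTokens("a b c", " ");
        for (Optional<String> t = tokens.poll(); t.isPresent(); t = tokens.poll()) {
            System.out.println(t.get());
        }
    }
}
```

Exposing hasMoreTokens() and nextToken() as two separately synchronized methods would not be enough, because a second thread could take the last token between the two calls.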
The following is one example of the use of the tokenizer. It also
demonstrates the usefulness of having both token and non-token delimiters in
one StringTokenizer.
The code:
String s = " ( aaa \t * (b+c1 ))";
StringTokenizer tokenizer = new StringTokenizer(s, " \t\n\r\f", "()+*");
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken());
}
prints the following output:
(
aaa
*
(
b
+
c1
)
)
Compatibility with java.util.StringTokenizer
In the original version of java.util.StringTokenizer, the method
nextToken() left the current position after the returned token,
and the method hasMoreTokens() moved (as a side effect) the
current position before the beginning of the next token. Thus, the code:
String s = "x=a,b,c";
java.util.StringTokenizer tokenizer = new java.util.StringTokenizer(s, "=");
System.out.println(tokenizer.nextToken());
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken(","));
}
prints the following output:
x
a
b
c
The Java SDK 1.3 implementation removed the undesired side effect of the
hasMoreTokens method: it no longer advances the current position.
However, after this change the output of the above code became:
x
=a
b
c
and there was no good way to produce a second token without "=".
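The behaviour described above can be checked against the java.util.StringTokenizer shipped with a current JDK. The sketch below only collects the tokens of the compatibility example into a list; the class name JdkTokenizerDemo and the helper tokens() are illustrative names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

class JdkTokenizerDemo {
    /** Reads the first token with "=" as delimiter, then the rest with ",". */
    static List<String> tokens(String s) {
        StringTokenizer tokenizer = new StringTokenizer(s, "=");
        List<String> out = new ArrayList<>();
        out.add(tokenizer.nextToken());
        // hasMoreTokens() does not move the current position, so the "="
        // is still ahead of it when the delimiter set switches to ",".
        while (tokenizer.hasMoreTokens()) {
            out.add(tokenizer.nextToken(","));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokens("x=a,b,c")); // [x, =a, b, c]
    }
}
```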
To solve the problem, this implementation introduces a new method
skipDelimiters(). To produce the original output, the above code
should be modified as:
String s = "x=a,b,c";
StringTokenizer tokenizer = new StringTokenizer(s, "=");
System.out.println(tokenizer.nextToken());
tokenizer.skipDelimiters();
while (tokenizer.hasMoreTokens()) {
    System.out.println(tokenizer.nextToken(","));
}
- Since:
- ostermillerutils 1.00.00
- Author:
- Stephen Ostermiller https://ostermiller.org/contact.pl?regarding=Java+Utilities
-
Constructor Summary
Constructors
StringTokenizer(String text)
    Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String nontokenDelims)
    Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String delims, boolean delimsAreTokens)
    Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String nontokenDelims, String tokenDelims)
    Constructs a string tokenizer for the specified string.
StringTokenizer(String text, String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
    Constructs a string tokenizer for the specified string.
Method Summary
int countTokens()
    Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception.
int countTokens(String delims)
    Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of (non-token) delimiters.
int countTokens(String delims, boolean delimsAreTokens)
    Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
int countTokens(String nontokenDelims, String tokenDelims)
    Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
int countTokens(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
    Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters.
int getCurrentPosition()
    Gets the index of the character immediately following the end of the last token.
boolean hasMoreElements()
    Returns the same value as the hasMoreTokens() method.
boolean hasMoreTokens()
    Tests if there are more tokens available from this tokenizer's string.
boolean hasNext()
    Returns the same value as the hasMoreTokens() method.
String next()
    Returns the same value as the nextToken() method.
String nextElement()
    Returns the same value as the nextToken() method.
String nextToken()
    Returns the next token from this string tokenizer.
String nextToken(String nontokenDelims)
    Returns the next token in this string tokenizer's string.
String nextToken(String delims, boolean delimsAreTokens)
    Returns the next token in this string tokenizer's string.
String nextToken(String nontokenDelims, String tokenDelims)
    Returns the next token in this string tokenizer's string.
String nextToken(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
    Returns the next token in this string tokenizer's string.
String peek()
    Returns the same value as nextToken() but does not alter the internal state of the tokenizer.
void remove()
    This implementation always throws UnsupportedOperationException.
String restOfText()
    Retrieves the rest of the text as a single token.
void setDelimiters(String delims)
    Sets the delimiters used to this set of (non-token) delimiters.
void setDelimiters(String delims, boolean delimsAreTokens)
    Sets the delimiters used to this set of delimiters.
void setDelimiters(String nontokenDelims, String tokenDelims)
    Sets the delimiters used to this set of delimiters.
void setDelimiters(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
    Sets the delimiters used to this set of delimiters.
void setReturnEmptyTokens(boolean returnEmptyTokens)
    Sets whether empty tokens should be returned from this point onward in the tokenizing process.
void setText(String text)
    Sets the text to be tokenized in this StringTokenizer.
boolean skipDelimiters()
    Advances the current position so it is before the next token.
String[] toArray()
    Retrieves all of the remaining tokens in a String array.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface java.util.Enumeration
asIterator
Methods inherited from interface java.util.Iterator
forEachRemaining
-
Constructor Details
-
StringTokenizer
public StringTokenizer(String text, String nontokenDelims, String tokenDelims)
Constructs a string tokenizer for the specified string. Both token and non-token delimiters are specified.
The current position is set at the beginning of the string.
- Parameters:
- text - a string to be parsed.
- nontokenDelims - the non-token delimiters, i.e. the delimiters that only separate tokens and are not returned as separate tokens.
- tokenDelims - the token delimiters, i.e. delimiters that both separate tokens and are themselves returned as tokens.
- Throws:
- NullPointerException - if text is null.
- Since:
- ostermillerutils 1.00.00
-
StringTokenizer
public StringTokenizer(String text, String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
Constructs a string tokenizer for the specified string. Both token and non-token delimiters are specified, as is whether or not empty tokens are returned. Empty tokens are tokens that are between consecutive delimiters.
This is the primary constructor (i.e. all other constructors are defined in terms of it).
The current position is set at the beginning of the string.
- Parameters:
- text - a string to be parsed.
- nontokenDelims - the non-token delimiters, i.e. the delimiters that only separate tokens and are not returned as separate tokens.
- tokenDelims - the token delimiters, i.e. delimiters that both separate tokens and are themselves returned as tokens.
- returnEmptyTokens - true if empty tokens may be returned; false otherwise.
- Throws:
- NullPointerException - if text is null.
- Since:
- ostermillerutils 1.00.00
-
StringTokenizer
public StringTokenizer(String text, String delims, boolean delimsAreTokens)
Constructs a string tokenizer for the specified string. Either token or non-token delimiters are specified.
Is equivalent to:
- If the third parameter is false -- StringTokenizer(text, delims, null)
- If the third parameter is true -- StringTokenizer(text, null, delims)
- Parameters:
- text - a string to be parsed.
- delims - the delimiters.
- delimsAreTokens - flag indicating whether the second parameter specifies token or non-token delimiters: false -- the second parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the second parameter specifies token delimiters, the set of non-token delimiters is empty.
- Throws:
- NullPointerException - if text is null.
- Since:
- ostermillerutils 1.00.00
-
StringTokenizer
public StringTokenizer(String text, String nontokenDelims)
Constructs a string tokenizer for the specified string. The characters in the nontokenDelims argument are the delimiters for separating tokens. Delimiter characters themselves will not be treated as tokens.
Is equivalent to StringTokenizer(text, nontokenDelims, null).
- Parameters:
- text - a string to be parsed.
- nontokenDelims - the non-token delimiters.
- Throws:
- NullPointerException - if text is null.
- Since:
- ostermillerutils 1.00.00
-
StringTokenizer
public StringTokenizer(String text)
Constructs a string tokenizer for the specified string. The tokenizer uses " \t\n\r\f" as the set of non-token delimiters, and an empty set of token delimiters.
Is equivalent to StringTokenizer(text, " \t\n\r\f", null).
- Parameters:
- text - a string to be parsed.
- Throws:
- NullPointerException - if text is null.
- Since:
- ostermillerutils 1.00.00
-
-
Method Details
-
setText
public void setText(String text)
Sets the text to be tokenized in this StringTokenizer.
This is useful for reusing a StringTokenizer, so that a new tokenizer does not have to be created for each string you want to tokenize.
The string will be tokenized from the beginning of the string.
- Parameters:
- text - a string to be parsed.
- Throws:
- NullPointerException - if text is null.
- Since:
- ostermillerutils 1.00.00
-
hasMoreTokens
public boolean hasMoreTokens()
Tests if there are more tokens available from this tokenizer's string. If this method returns true, then a subsequent call to nextToken with no argument will successfully return a token.
The current position is not changed.
- Returns:
- true if and only if there is at least one token in the string after the current position; false otherwise.
- Since:
- ostermillerutils 1.00.00
-
nextToken
public String nextToken()
Returns the next token from this string tokenizer.
The current position is set after the token returned.
- Returns:
- the next token from this string tokenizer.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
skipDelimiters
public boolean skipDelimiters()
Advances the current position so it is before the next token.
This method skips non-token delimiters but does not skip token delimiters.
This method is useful when switching to a new delimiter set (see the second example in the class comment).
- Returns:
- true if there are more tokens, false otherwise.
- Since:
- ostermillerutils 1.00.00
-
countTokens
public int countTokens()
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception. The current position is not advanced.
- Returns:
- the number of tokens remaining in the string using the current delimiter set.
- Since:
- ostermillerutils 1.00.00
-
setDelimiters
public void setDelimiters(String delims)
Sets the delimiters used to this set of (non-token) delimiters.
- Parameters:
- delims - the new set of non-token delimiters (the set of token delimiters will be empty).
- Since:
- ostermillerutils 1.00.00
-
setDelimiters
public void setDelimiters(String delims, boolean delimsAreTokens)
Sets the delimiters used to this set of delimiters.
- Parameters:
- delims - the new set of delimiters.
- delimsAreTokens - flag indicating whether the first parameter specifies token or non-token delimiters: false -- the first parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the first parameter specifies token delimiters, the set of non-token delimiters is empty.
- Since:
- ostermillerutils 1.00.00
-
setDelimiters
public void setDelimiters(String nontokenDelims, String tokenDelims)
Sets the delimiters used to this set of delimiters.
- Parameters:
- nontokenDelims - the new set of non-token delimiters.
- tokenDelims - the new set of token delimiters.
- Since:
- ostermillerutils 1.00.00
-
setDelimiters
public void setDelimiters(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
Sets the delimiters used to this set of delimiters.
- Parameters:
- nontokenDelims - the new set of non-token delimiters.
- tokenDelims - the new set of token delimiters.
- returnEmptyTokens - true if empty tokens may be returned; false otherwise.
- Since:
- ostermillerutils 1.00.00
-
countTokens
public int countTokens(String delims)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of (non-token) delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.
- Parameters:
- delims - the new set of non-token delimiters (the set of token delimiters will be empty).
- Returns:
- the number of tokens remaining in the string using the new delimiter set.
- Since:
- ostermillerutils 1.00.00
-
countTokens
public int countTokens(String delims, boolean delimsAreTokens)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.
- Parameters:
- delims - the new set of delimiters.
- delimsAreTokens - flag indicating whether the first parameter specifies token or non-token delimiters: false -- the first parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the first parameter specifies token delimiters, the set of non-token delimiters is empty.
- Returns:
- the number of tokens remaining in the string using the new delimiter set.
- Since:
- ostermillerutils 1.00.00
-
countTokens
public int countTokens(String nontokenDelims, String tokenDelims)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.
- Parameters:
- nontokenDelims - the new set of non-token delimiters.
- tokenDelims - the new set of token delimiters.
- Returns:
- the number of tokens remaining in the string using the new delimiter set.
- Since:
- ostermillerutils 1.00.00
-
countTokens
public int countTokens(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
Calculates the number of times that this tokenizer's nextToken method can be called before it generates an exception using the given set of delimiters. The delimiters given will be used for future calls to nextToken() unless new delimiters are given. The current position is not advanced.
- Parameters:
- nontokenDelims - the new set of non-token delimiters.
- tokenDelims - the new set of token delimiters.
- returnEmptyTokens - true if empty tokens may be returned; false otherwise.
- Returns:
- the number of tokens remaining in the string using the new delimiter set.
- Since:
- ostermillerutils 1.00.00
-
nextToken
public String nextToken(String nontokenDelims, String tokenDelims)
Returns the next token in this string tokenizer's string.
First, the sets of token and non-token delimiters are changed to be the tokenDelims and nontokenDelims, respectively. Then the next token (with respect to the new delimiters) in the string after the current position is returned.
The current position is set after the token returned.
The new delimiter sets remain in effect after this call.
- Parameters:
- nontokenDelims - the new set of non-token delimiters.
- tokenDelims - the new set of token delimiters.
- Returns:
- the next token, after switching to the new delimiter set.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
nextToken
public String nextToken(String nontokenDelims, String tokenDelims, boolean returnEmptyTokens)
Returns the next token in this string tokenizer's string.
First, the sets of token and non-token delimiters are changed to be the tokenDelims and nontokenDelims, respectively, and whether or not to return empty tokens is set. Then the next token (with respect to the new delimiters) in the string after the current position is returned.
The current position is set after the token returned.
The new delimiter sets remain in effect after this call, and empty tokens continue to be handled as they are in this call.
- Parameters:
- nontokenDelims - the new set of non-token delimiters.
- tokenDelims - the new set of token delimiters.
- returnEmptyTokens - true if empty tokens may be returned; false otherwise.
- Returns:
- the next token, after switching to the new delimiter set.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
nextToken
public String nextToken(String delims, boolean delimsAreTokens)
Returns the next token in this string tokenizer's string.
Is equivalent to:
- If the second parameter is false -- nextToken(delims, null)
- If the second parameter is true -- nextToken(null, delims)
- Parameters:
- delims - the new set of token or non-token delimiters.
- delimsAreTokens - flag indicating whether the first parameter specifies token or non-token delimiters: false -- the first parameter specifies non-token delimiters, the set of token delimiters is empty; true -- the first parameter specifies token delimiters, the set of non-token delimiters is empty.
- Returns:
- the next token, after switching to the new delimiter set.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
nextToken
public String nextToken(String nontokenDelims)
Returns the next token in this string tokenizer's string.
Is equivalent to nextToken(nontokenDelims, null).
- Parameters:
- nontokenDelims - the new set of non-token delimiters (the set of token delimiters will be empty).
- Returns:
- the next token, after switching to the new delimiter set.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
hasMoreElements
public boolean hasMoreElements()
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Enumeration interface.
- Specified by:
- hasMoreElements in interface Enumeration<String>
- Returns:
- true if there are more tokens; false otherwise.
- Since:
- ostermillerutils 1.00.00
-
nextElement
public String nextElement()
Returns the same value as the nextToken() method. It exists so that this class can implement the Enumeration interface.
- Specified by:
- nextElement in interface Enumeration<String>
- Returns:
- the next token in the string.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
hasNext
public boolean hasNext()
Returns the same value as the hasMoreTokens() method. It exists so that this class can implement the Iterator interface.
next
public String next()
Returns the same value as the nextToken() method. It exists so that this class can implement the Iterator interface.
- Specified by:
- next in interface Iterator<String>
- Returns:
- the next token in the string.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
-
remove
public void remove()
This implementation always throws UnsupportedOperationException. It exists so that this class can implement the Iterator interface.
- Specified by:
- remove in interface Iterator<String>
- Throws:
- UnsupportedOperationException - always thrown.
- Since:
- ostermillerutils 1.00.00
-
setReturnEmptyTokens
public void setReturnEmptyTokens(boolean returnEmptyTokens)
Sets whether empty tokens should be returned from this point onward in the tokenizing process.
Empty tokens occur when two delimiters are next to each other or a delimiter occurs at the beginning or end of a string. If empty tokens are set to be returned, and a comma is the non-token delimiter, the following table shows how many tokens are in each string.
Token counts by string type
String          Number of tokens
"one,two"       2 - normal case with no empty tokens.
"one,,three"    3 including the empty token in the middle.
"one,"          2 including the empty token at the end.
",two"          2 including the empty token at the beginning.
","             2 including the empty tokens at the beginning and the end.
""              1 - all strings will have at least one token if empty tokens are returned.
- Parameters:
- returnEmptyTokens - true if and only if empty tokens should be returned.
- Since:
- ostermillerutils 1.00.00
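For comparison, the counts in the table above match what the JDK's String.split produces when called with a negative limit, which keeps leading and trailing empty strings. The sketch below is an analogy for checking the table by hand, not this class's implementation; EmptyTokenCounts and count() are illustrative names.

```java
class EmptyTokenCounts {
    /** Counts comma-separated fields, keeping empty ones (negative limit). */
    static int count(String s) {
        return s.split(",", -1).length;
    }

    public static void main(String[] args) {
        String[] samples = {"one,two", "one,,three", "one,", ",two", ",", ""};
        for (String s : samples) {
            // Prints each sample in quotes followed by its field count.
            System.out.println("\"" + s + "\" " + count(s));
        }
    }
}
```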
-
getCurrentPosition
public int getCurrentPosition()
Gets the index of the character immediately following the end of the last token. This is the position at which this tokenizer will begin looking for the next token when a nextToken() method is invoked.
- Returns:
- the current position or -1 if the entire string has been tokenized.
- Since:
- ostermillerutils 1.00.00
-
toArray
public String[] toArray()
Retrieves all of the remaining tokens in a String array. This method uses the options that are currently set for the tokenizer and will advance the state of the tokenizer such that hasMoreTokens() will return false.
- Returns:
- an array of tokens from this tokenizer.
- Since:
- ostermillerutils 1.00.00
-
restOfText
public String restOfText()
Retrieves the rest of the text as a single token. After calling this method hasMoreTokens() will always return false.
- Returns:
- any part of the text that has not yet been tokenized.
- Since:
- ostermillerutils 1.00.00
-
peek
public String peek()
Returns the same value as nextToken() but does not alter the internal state of the tokenizer. Subsequent calls to peek() or a call to nextToken() will return the same token again.
- Returns:
- the next token from this string tokenizer.
- Throws:
- NoSuchElementException - if there are no more tokens in this tokenizer's string.
- Since:
- ostermillerutils 1.00.00
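The contract of peek() can be illustrated with a small lookahead wrapper. This is a sketch under stated assumptions: PeekableTokens is a hypothetical class, java.util.StringTokenizer stands in for this class so the example runs on a plain JDK, and the wrapper caches one token rather than leaving the underlying tokenizer untouched, which may differ from how this class implements peek().

```java
import java.util.NoSuchElementException;
import java.util.StringTokenizer;

/** Hypothetical one-token lookahead around java.util.StringTokenizer. */
class PeekableTokens {
    private final StringTokenizer tokenizer;
    private String lookahead; // cached token, or null when nothing is cached

    PeekableTokens(String text, String delims) {
        this.tokenizer = new StringTokenizer(text, delims);
    }

    /** Returns the next token without consuming it. */
    String peek() {
        if (lookahead == null) {
            // Throws NoSuchElementException when no tokens remain.
            lookahead = tokenizer.nextToken();
        }
        return lookahead;
    }

    /** Returns the next token and consumes it. */
    String nextToken() {
        String token = peek();
        lookahead = null;
        return token;
    }

    public static void main(String[] args) {
        PeekableTokens tokens = new PeekableTokens("a b", " ");
        System.out.println(tokens.peek());      // a
        System.out.println(tokens.peek());      // a
        System.out.println(tokens.nextToken()); // a
        System.out.println(tokens.nextToken()); // b
        try {
            tokens.peek();
        } catch (NoSuchElementException e) {
            System.out.println("no more tokens");
        }
    }
}
```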
-