
Why aren't 'new' and '.' in the operator precedence table?

 
Ranch Foreman
Posts: 626
2
I was surprised to find that 'new' and '.'  aren't listed here
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/operators.html

Since this compiles, new has greater precedence than '.'
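The snippet referred to here did not survive in the thread. A minimal sketch of the kind of expression that makes the point, assuming a plain `StringBuilder` (the class name `NewBeforeDot` is made up for illustration):

```java
class NewBeforeDot {
    static int lengthOfNew() {
        // Parsed as (new StringBuilder("abc")).length():
        // the instance is created first, then '.' is applied to it.
        return new StringBuilder("abc").length();
    }

    public static void main(String[] args) {
        System.out.println(lengthOfNew()); // prints 3
    }
}
```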
       

But if there were to be two new rows for them in relation to the other operators, where would they be put - at the top?
I vaguely remember that C++ used to have these in its precedence table.



 
Saloon Keeper
Posts: 27851
196
Good question, actually. But that table is not only incomplete, it's outdated.

This one is better: https://introcs.cs.princeton.edu/java/11precedence/
 
Master Rancher
Posts: 4905
74
Unfortunately, that Princeton table is incorrect.  At least, it's wrong in its placement of "member access" at the top, at the same level as parentheses and array access.  Because member access would include field access and method access, and as Anil just showed, those are lower precedence than "new".  The Princeton table shows member access as higher in precedence than new, and that just isn't accurate.

The Java Language Specification actually lists the sections of JLS Chapter 15: Expressions in order of precedence - at least generally.  There may be small deviations from this that I haven't noticed.  And they don't directly say that it's in order of precedence.  They don't talk about precedence directly, at all.  But they present in order of precedence, because that's the clearest way to lay out the grammar rules as they have defined them, which implicitly spell out the order of precedence.

I know you don't like reading the JLS.  But, that's where the answer is.  Not a lot I can do about that.  Here's the relevant excerpt:

15.8. Primary Expressions
15.8.1. Lexical Literals
15.8.2. Class Literals
15.8.3. this
15.8.4. Qualified this
15.8.5. Parenthesized Expressions
15.9. Class Instance Creation Expressions
15.10. Array Creation and Access Expressions
15.11. Field Access Expressions
15.12. Method Invocation Expressions
15.13. Method Reference Expressions
15.14. Postfix Expressions
15.15. Unary Operators
15.16. Cast Expressions
15.17. Multiplicative Operators
15.18. Additive Operators
15.19. Shift Operators
15.20. Relational Operators
15.21. Equality Operators
15.22. Bitwise and Logical Operators
15.23. Conditional-And Operator &&
15.24. Conditional-Or Operator ||
15.25. Conditional Operator ? :
15.26. Assignment Operators
15.27. Lambda Expressions
15.28. switch Expressions
15.29. Constant Expressions


"new" is covered by 15.9. Class Instance Creation Expressions.  "." is covered by 15.11. Field Access Expressions, and 15.12. Method Invocation Expressions.  So, "new" is higher precedence than ".".

Anil, for whatever reason, Java doesn't treat new and . as operators.  Unlike C++.  But there are lots of other language constructs that aren't operators either, and those also have precedence.  Those are all spelled out in the details of the JLS.
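As a rough illustration of how that section ordering shows up in everyday code, here is a small sketch. The class and method names are hypothetical, and it only demonstrates two pairs of constructs from the list above: method invocation (15.12) against unary minus (15.15), and method invocation against a cast (15.16).

```java
class PrimaryBeforeUnary {
    static int negatedLength(String s) {
        // Method invocation (15.12) comes before unary minus (15.15),
        // so this is -(s.length()); (-s).length() would not even compile.
        return -s.length();
    }

    static int lengthOf(Object o) {
        // A cast (15.16) comes after method invocation (15.12), so
        // "(String) o.length()" would mean (String) (o.length()) and fail to
        // compile, because Object has no length(). Parentheses force the cast first.
        return ((String) o).length();
    }
}
```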
 
Anil Philip
Ranch Foreman
Posts: 626
2

Mike Simmons wrote:And they don't directly say that it's in order of precedence.  They don't talk about precedence directly, at all.  But they present in order of precedence, because that's the clearest way to lay out the grammar rules as they have defined them, which implicitly spell out the order of precedence.
15.9. Class Instance Creation Expressions
15.10. Array Creation and Access Expressions
15.11. Field Access Expressions
But there are lots of other language constructs that aren't operators either, and those also have precedence.  Those are all spelled out in the details of the JLS.



Thanks. I think it is an assumption that these are in order of precedence - unless they say so.
I don't know why they wouldn't make a table.
I just now realized that the array brackets [ ] are not in the Oracle table either.


Is your assumption correct? Because this is the error I get:

java: array required, but java.lang.String found



In your list, array access is listed in the order.



 
Mike Simmons
Master Rancher
Posts: 4905
74
Well, Strings aren't arrays.  There's no problem with precedence there, but Java does not support treating a String as an array.  Other languages do, because it's a concise way to access an individual character, but Java does not.
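For reference, a minimal sketch of the supported way to read a single char from a String. The class name is hypothetical, and the string "Buonaserata" is borrowed from later in the thread:

```java
class CharAtInsteadOfIndex {
    static char fourthChar(String s) {
        // s[3] is a compile-time error for a String ("array required");
        // charAt is the supported way to read one char.
        return s.charAt(3);
    }

    public static void main(String[] args) {
        System.out.println(fourthChar("Buonaserata")); // prints n
    }
}
```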
 
Mike Simmons
Master Rancher
Posts: 4905
74

Anil Philip wrote:Thanks. I think it is an assumption that these are in order of precedence - unless they say so.


Agreed.  The closest I found was from JLS 15.2 Forms of Expressions:

Precedence among operators is managed by a hierarchy of grammar productions. The lowest precedence operator is the arrow of a lambda expression (->), followed by the assignment operators. Thus, all expressions are syntactically included in the LambdaExpression and AssignmentExpression nonterminals:

Expression:
LambdaExpression
AssignmentExpression


Here they're limiting the discussion to operators again, and ignoring that there are other non-operator constructs that we need to parse too. And they're still not saying that they are presenting them in order.  But they are telling us that the precedence is implicit in the grammar rules, and giving lambda expression as the lowest-precedence starting point among the operators.
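A small sketch of what "the arrow is the lowest precedence operator" means in practice. The class name is made up, and `IntUnaryOperator` is just one convenient functional interface to hang the lambda on:

```java
import java.util.function.IntUnaryOperator;

class ArrowIsLowest {
    static int apply(int x) {
        // The arrow binds more loosely than * and +, so the whole of
        // "v * 2 + 1" is the lambda body: v -> ((v * 2) + 1).
        IntUnaryOperator f = v -> v * 2 + 1;
        return f.applyAsInt(x);
    }
}
```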

Anil Philip wrote:I don't know why they wouldn't make a table.


That would be too easy.  They wouldn't want to do that.

(I don't know either.  Clearly, the tutorial people thought a table would be useful.  But they didn't maintain it.)

The Princeton table is from the Algorithms book by Sedgewick.  He's a pretty smart cookie, but not specifically a Java guy.  I think his coauthor Kevin Wayne translated the previous edition of the book into Java.  Well, someone else first translated C code into rather odd-looking Java, and then Kevin came along and converted it to respectable Java.  That table may be mostly accurate, but the "member access" at the top seems clearly wrong.  This warrants further study, I think.
 
Anil Philip
Ranch Foreman
Posts: 626
2

Mike Simmons wrote:Well, Strings aren't arrays.  There's no problem with precedence there, but Java does not support treating a String as an array.  Other languages do, because it's a concise way to access an individual character, but Java does not.



But I want to make a String array using its copy-constructor.
This

not this...

 
Saloon Keeper
Posts: 15608
366
You can't make an array using the constructor of some arbitrary class.

It's not clear what end result you're trying to achieve, but what you wrote is not valid Java syntax.
 
Anil Philip
Ranch Foreman
Posts: 626
2
off topic: @Tim

Tim Holloway wrote:The secret of how to be miserable is to constantly expect things are going to happen the way that they are "supposed" to happen.
You can have faith, which carries the understanding that you may be disappointed. Then there's being a willfully-blind idiot, which virtually guarantees it.



I actually put your quote on Facebook (giving you credit as the author) and my cousin in Australia sent me this message:

I saw the quote before. I agree but I didn't understand the last bit. What is it saying about the "willfully-blind idiot"? What are they guaranteed?



I replied that you will have to ask the guy [Tim] yourself, and also:
"I think he means you are guaranteed disappointment if you are a "willfully blind idiot" who does not face reality. But then that makes no sense because faith is not about reality"
 
Mike Simmons
Master Rancher
Posts: 4905
74
SvH++

Neither of us can tell what you want to do here, and neither can the compiler.  I retract my comment about other languages supporting it, because I was thinking of something different than you are, apparently.

Also, it's "Buonaserata", since "serata" is feminine.
 
Anil Philip
Ranch Foreman
Posts: 626
2

Mike Simmons wrote:SvH++

Neither of us can tell what you want to do here, and neither can the compiler.  

Also, it's "Buonaserata", since "serata" is feminine.



Thanks for the grammar correction!

I want to create an array of String using
as the constructor, not an array containing nulls.
 
Mike Simmons
Master Rancher
Posts: 4905
74
What would the contents look like?  Is it an array of size one, whose single value is "Buonaserata"?  Is it an array of one-character strings, like "B", "u", "o", "n", "a", "s", "e", "r", "a", "t", "a"?  What did the number 3 have to do with the array you were creating?
 
Marshal
Posts: 28258
95

Anil Philip wrote:I want to create an array of String using
as the constructor, not an array containing nulls.


How about this? Admittedly it doesn't use the constructor you asked about, but then there isn't any such code which could fulfil that requirement. And also it produces an array of chars, not an array of Strings, but perhaps one could build on it. A Stream with a Collector maybe.
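The code from this post is not preserved. Going only by the description (an array of chars, possibly built on with a Stream), a sketch might look like the following. The use of `toCharArray()` and `chars()` is an assumption about what was meant, and the class and method names are hypothetical:

```java
import java.util.Arrays;

class SplitIntoLetters {
    static char[] asChars(String s) {
        // One char per array slot.
        return s.toCharArray();
    }

    static String[] asStrings(String s) {
        // chars() yields an IntStream of char values;
        // each one is mapped to a one-character String.
        return s.chars()
                .mapToObj(c -> String.valueOf((char) c))
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asStrings("Buonaserata")));
    }
}
```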
 
Mike Simmons
Master Rancher
Posts: 4905
74
That would be an array of chars, rather than an array of Strings.  But yes, it's another possible interpretation.

Another is

which is an array of Strings with just one element, equivalent to
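Neither snippet survived in the thread. A sketch of two forms that match the description (a one-element String array, written with and without `new`), under the assumption that these are the forms meant; the class and method names are made up:

```java
class SingleElementArray {
    static String[] withNew() {
        // Array creation expression with an initializer.
        return new String[] { "Buonaserata" };
    }

    static String[] withInitializer() {
        // Bare array initializer, only legal in a declaration.
        String[] arr = { "Buonaserata" };
        return arr;
    }
}
```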
 
Marshal
Posts: 79392
377

Mike Simmons wrote:Unfortunately, that Princeton table is incorrect. . . .

It is also incorrect in not mentioning the left‑to‑right rule.
 
Mike Simmons
Master Rancher
Posts: 4905
74
I would say that it's incomplete in that respect, not inaccurate.  The left-to-right rule is about order of execution, not precedence.  The table doesn't seem to make any claims about order of execution.  That may or may not be covered somewhere else in their book; I don't know.  There are a lot of other topics not directly addressed by the table - I'm not sure which others they're responsible for covering here.
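To make the distinction concrete, here is a small sketch (all names hypothetical) in which precedence decides the grouping while the operands are still evaluated left to right:

```java
class OrderVersusPrecedence {
    static final StringBuilder LOG = new StringBuilder();

    static int operand(String name, int value) {
        LOG.append(name); // record when this operand is evaluated
        return value;
    }

    static int evaluate() {
        LOG.setLength(0);
        // Precedence groups this as a + (b * c), yet the operands
        // are still evaluated left to right: a, then b, then c.
        return operand("a", 1) + operand("b", 2) * operand("c", 3);
    }

    public static void main(String[] args) {
        System.out.println(evaluate()); // prints 7
        System.out.println(LOG);        // prints abc
    }
}
```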

Also, I was wrong earlier - it's not from their Algorithms book, but from a newer book, Computer Science: An Interdisciplinary Approach.
 
Stephan van Hulst
Saloon Keeper
Posts: 15608
366

Anil Philip wrote:I want to create an array of String using
as the constructor, not an array containing nulls.


So, I'm guessing the end goal is an array containing three copies of the same string. You can do that in a few different ways:
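The snippets that followed are not preserved here. A minimal sketch of two ways that fit the description, assuming the goal really is a three-element array filled with the same string; the class and method names are hypothetical:

```java
import java.util.Arrays;
import java.util.Collections;

class ThreeCopies {
    static String[] withFill(String s) {
        String[] arr = new String[3];
        // Every slot ends up holding the same String reference.
        Arrays.fill(arr, s);
        return arr;
    }

    static String[] withNCopies(String s) {
        // nCopies builds an immutable list of three references to s,
        // which is then copied into a String[].
        return Collections.nCopies(3, s).toArray(new String[0]);
    }
}
```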



 
Campbell Ritchie
Marshal
Posts: 79392
377

Stephan van Hulst wrote:. . .

Won't that make var mean Object[]? Don't you prefer to add String[]::new somewhere?
 
Stephan van Hulst
Saloon Keeper
Posts: 15608
366
You are absolutely right, it should read:
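The corrected snippet is not preserved in the thread. A sketch of the difference being discussed, assuming a stream-based version: `toArray()` with no argument returns `Object[]`, while passing `String[]::new` yields a `String[]`. The class and method names are made up:

```java
import java.util.stream.Stream;

class TypedToArray {
    static Object[] untyped(String s) {
        // No generator argument: the result type is Object[],
        // which is also what `var` would infer here.
        return Stream.generate(() -> s).limit(3).toArray();
    }

    static String[] typed(String s) {
        // The generator tells the stream which array type to allocate.
        return Stream.generate(() -> s).limit(3).toArray(String[]::new);
    }
}
```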
 
Tim Holloway
Saloon Keeper
Posts: 27851
196
A note regarding treating Strings like character arrays.

Technically, there's no reason why you cannot make the "[]" operator set equivalent to charAt(), and if enough people pushed for it, it might even be so, some day.

But there are some caveats. First, in its current internal incarnation, you cannot always determine charAt(x) via simple byte offset, since String now allows some variable-length character constructs in the interests of storage efficiency.

More importantly, though, String is immutable. In a normal Java array, a "x[y]" is a legitimate lefthand-side (LHS) expression token in all cases. And since String is a subclass of class Object, you couldn't always detect mis-use at compile time.

I am amused that I've stirred up interest in my signature. And incidentally, the signature is not preserved with the posting, so when I change it, people reading that discussion are likely to become confused, but so it goes.

It was once popular to say "Faith is believing what you know ain't so". That, of course is nonsense. Faith is believing what you fear might not be so, but you hope otherwise. Believing in the obviously false — that is, being willfully blind to reality — is the mark of an idiot.

Speaking of which, I've been considering a shorter maxim: "An idiot is never wrong".
 
Mike Simmons
Master Rancher
Posts: 4905
74
Not really connected to the original question, but interesting anyway:

Tim Holloway wrote:Technically, there's no reason why you cannot make the "[]" operator set equivalent to charAt(), and if enough people pushed for it, it might even be so, some day.


And indeed, this has been already done in some other Java-based languages like Groovy and Kotlin, which offer Java-like syntax with various enhancements.  That's why I first guessed that this might be what Anil meant by that code.  But, apparently not.

Tim Holloway wrote:But there are some caveats. First, in its current internal incarnation, you cannot always determine charAt(x) via simple byte offset, since String now allows some variable-length character constructs in the interests of storage efficiency.


Right, String.charAt(i) might be quite different from arr[i] of the underlying array.  But, the underlying array is an implementation detail, not what the API supports.

More concerning from my point of view, nowadays we might want to talk about code points rather than chars - and those can also be different.  Should str[i] return str.charAt(i) or str.codePointAt(i)?  That's less obvious today, I think.  Pretty sure Groovy and Kotlin went with str.charAt(i), though.

Tim Holloway wrote:More importantly, though, String is immutable. In a normal Java array, a "x[y]" is a legitimate lefthand-side (LHS) expression token in all cases. And since String is a subclass of class Object, you couldn't always detect mis-use at compile time.


I don't think that's much of a problem.  A change like this would require changes to the compiler anyway.  I think if the compiler sees "foo[i]", it should already have some info on the type of foo, and based on that, handle it differently if foo is an array, than if foo is a String.  It wouldn't be too difficult to error out if someone tries to use a String.get(i) as the left hand side of an assignment.  I mean, relative to the difficulty of implementing the rest of the change.

Even more useful than these String enhancements would be to do the same sort of thing for a List - so list[i] means either list.get(i) or list.set(i, RHS), depending on context.
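For comparison, this is what the two hypothetical forms would stand for in today's Java. The bracket syntax in the comments does not exist for lists; it is only the idea being floated above. The class name is made up, and `List.of` assumes Java 9 or later:

```java
import java.util.ArrayList;
import java.util.List;

class ListIndexingToday {
    static List<String> demo() {
        List<String> list = new ArrayList<>(List.of("a", "b", "c"));
        String first = list.get(0); // what a hypothetical `list[0]` read would mean
        list.set(2, first);         // what a hypothetical `list[2] = first` would mean
        return list;
    }
}
```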

I don't expect such changes to come anytime soon.  But, you never know.  It wouldn't break any current code, since no existing code could be using this syntax for a String or List.  It could happen... I can dream...
 
Tim Holloway
Saloon Keeper
Posts: 27851
196

I don't think the compiler could catch this. It would need a runtime Exception.
 
Campbell Ritchie
Marshal
Posts: 79392
377

Tim Holloway wrote:A note regarding treating Strings like character arrays.. . .

That would be inconsistent with the current state of the JLS. It would make it into a different language. I think that section is there to warn people that Java®≠C++. I don't know much C++, but I know, in C, there is no such thing as a String, only a char[], maybe more precisely, a char*.

"An idiot is never wrong".

Nor am I

Oops. What did I say?
 
Tim Holloway
Saloon Keeper
Posts: 27851
196
Every new version of Java changes the state of the JLS. That's what makes it a new version.

There is no native string type in C or C++, but the standard libraries now define a class "string", and as in Java (internally), it's an OOP wrapper for an array of characters.

Note also that in C++, "[]" is an overloadable operator, so you can redefine how "[]" works on a string class to do anything you want it to. Note that I avoided the term "indexing" there, precisely for that reason.

Also I botched my array assignment example, as I'm sure you'll point out. But I suspect that something more valid is likely to be possible.

 
Mike Simmons
Master Rancher
Posts: 4905
74

Tim Holloway wrote:
I don't think the compiler could catch this. It would need a runtime Exception.


I wouldn't suggest that a String is a subtype of Object[], no. If that's what you were arguing against, I agree.  But they could still allow the syntactic sugar that str[i] just means str.charAt(i), without attempting to retroactively treat the whole String as an array.  Of course str[i] would not be valid as an LHS, and that would have to be a compile-time error.
 
Campbell Ritchie
Marshal
Posts: 79392
377

Tim Holloway wrote:. . . It would need a runtime Exception.

Is that what I was supposed to notice?
 
Mike Simmons
Master Rancher
Posts: 4905
74

Tim Holloway wrote:Every new version of Java changes the state of the JLS. That's what makes it a new version.


Theoretically, they could have a new version that just has library changes, without changing the language spec at all.  I think that did happen a few times in the past, JDK 1.4, 6, and I'm not sure about 1.2 and 1.3.  Those might be more indicative of a lag between the JDK and the formal spec, though.  Nowadays Oracle is a lot more open to language change than Sun was, so we tend to always have language changes.  But I don't think it's necessary for having a new version.
 
Ranch Hand
Posts: 55
1
A String can be easily broken into characters using the String.codePoints() method.

java.lang.String.codePoints wrote:codePoints
public IntStream codePoints()
Returns a stream of code point values from this sequence. Any surrogate pairs encountered in the sequence are combined as if by Character.toCodePoint and the result is passed to the stream. Any other code units, including ordinary BMP characters, unpaired surrogates, and undefined code units, are zero-extended to int values which are then passed to the stream.
Specified by:
codePoints in interface CharSequence
Returns:
an IntStream of Unicode code points from this sequence
Since:
9

 
Mike Simmons
Master Rancher
Posts: 4905
74
Well, it can be broken into code points with codePoints(), and it can be broken into chars with chars().  Breaking it into "characters" is kind of ambiguous under the circumstances, I think.  But yes, String has those methods, and many others.  My point in bringing up code points was not about how we handle them - there are a number of methods for this.  But for this hypothetical "str[i]" notation, it's not clear whether it should refer to chars or code points, since they are different things.
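A short sketch of how the two streams differ once a string contains a character outside the Basic Multilingual Plane. The class name and the sample string in the usage note are assumptions for illustration:

```java
class CharsAndCodePoints {
    static long charCount(String s) {
        // chars() streams UTF-16 code units, one per char.
        return s.chars().count();
    }

    static long codePointCount(String s) {
        // codePoints() combines each surrogate pair into one value.
        return s.codePoints().count();
    }
}
```

For "a" followed by U+1F600 (written `"a\uD83D\uDE00"`), `charCount` gives 3 and `codePointCount` gives 2.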
 
Ira Go
Ranch Hand
Posts: 55
1

Mike Simmons wrote:My point in bringing up code points was not about how we handle them - there are a number of methods for this.  But for this hypothetical "str[i]" notation, it's not clear whether it should refer to chars or code points, since they are different things.


Sorry, I need to re-read the thread more carefully.  (Why would anyone want [] on Java strings? It looks like such a mess, imho)
 
Stephan van Hulst
Saloon Keeper
Posts: 15608
366
Many languages support the use of [] on objects that represent a homogeneous sequence of elements. A string is literally that: "a sequence of characters strung together". I think many developers coming from other languages find it surprising that Java doesn't support the [] operator on strings.

The problem with strings is that historically, we have never really had a good idea of what a "character" really is. It started with the rather limited notion of "a letter from the latin alphabet", but even now there is no universally accepted definition of a "character". Is it a code unit of the string's underlying encoding? Is it a code point from the Unicode character set? Is it a combined graphical glyph?

I don't have a problem with using the [] operator to index characters from a string, but there must be a clear and sensible definition of "character". There's no way to make everybody happy. Returning code points will make developers that are used to char unhappy, and returning char will make everybody that is sick of working with UTF-16 unhappy.

To me, the [] operator also implies random access, so if, for example, it is used to retrieve code points, that precludes the string from storing its contents as an array of UTF-8 bytes.
 
Mike Simmons
Master Rancher
Posts: 4905
74
Yeah, I think String is still, fundamentally, very much a sequence of chars, rather than a sequence of code points.  They weren't even thinking about code points when they originally designed it.  And so it would only make sense for str[i] to mean str.charAt(i), rather than str.codePointAt(i).  For one thing, the i in both cases is the index as measured in chars, even if you're trying to use code points.  A String like

contains 3 code points, but 6 chars.  And to access those points using codePointAt(), you would need str.codePointAt(0), str.codePointAt(2), str.codePointAt(4).  So it would be very confusing to mix this with the simple-looking notation of str[i].  It would be much more straightforward for str[i] to represent str.charAt(i).
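The string literal from this post is not preserved. A sketch with an assumed example, three emoji that each need a surrogate pair, shows the same 3-versus-6 split and the 0, 2, 4 indexing; the class name and the choice of characters are made up:

```java
class CodePointsVersusChars {
    // U+1F600, U+1F601, U+1F602: each is outside the BMP,
    // so each takes two chars (assumed example, not the original string).
    static final String STR = "\uD83D\uDE00\uD83D\uDE01\uD83D\uDE02";

    public static void main(String[] args) {
        System.out.println(STR.length());                        // prints 6
        System.out.println(STR.codePointCount(0, STR.length())); // prints 3
        // Indices are measured in chars, so the code points start at 0, 2 and 4.
        System.out.println(Integer.toHexString(STR.codePointAt(0))); // prints 1f600
        System.out.println(Integer.toHexString(STR.codePointAt(2))); // prints 1f601
        System.out.println(Integer.toHexString(STR.codePointAt(4))); // prints 1f602
    }
}
```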

It's a good point about random access - though it's not any sort of formal guarantee for things using [].  That's just what we're used to with arrays.  And if you use UTF-8, that problem is already present for charAt() anyway.  That's probably why current String implementations use either latin-1 or UTF-16 internally, so that they can easily guarantee fast random access.
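To illustrate why indexing by code point is not naturally random access, here is a sketch of looking up the n-th code point with the existing API. The helper name `nthCodePoint` is hypothetical; `offsetByCodePoints` and `codePointAt` are real `String` methods:

```java
class NthCodePoint {
    static int nthCodePoint(String s, int n) {
        // offsetByCodePoints has to step past n code points from index 0
        // to find the char index, so in general this is not constant-time.
        int charIndex = s.offsetByCodePoints(0, n);
        return s.codePointAt(charIndex);
    }
}
```

By contrast, `s.charAt(i)` needs no such scan, because `i` is already a char index.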
 
Mike Simmons
Master Rancher
Posts: 4905
74
Going back to the original point of the thread though, I reviewed the Princeton table at

https://introcs.cs.princeton.edu/java/11precedence/

and it seems that all the other operators and language constructs are in the right order.  Only object creation with "new" is out of place - it should be at the top, level 16, rather than in level 13.  The other operation originally at level 13, the cast operation, is correct in being placed at that level.  
 
Mike Simmons
Master Rancher
Posts: 4905
74
Well, that was fast.  I dropped a note to the book authors Robert Sedgewick and Kevin Wayne, and Kevin responded an hour later saying he agreed and has already updated the table.  He also added entries for switch expressions and method references, which were later features missing from the table.  So, go me!  Also thanks to newly-certified Anil Philip, who raised the issue in the first place, in the context of a different table.  And to Tim Holloway who pointed out the Princeton table, which is more complete.  This is now the best online reference, I think, for a quick summary of operator precedence:

https://introcs.cs.princeton.edu/java/11precedence/
 
Campbell Ritchie
Marshal
Posts: 79392
377

Mike Simmons wrote:. . . They weren't even thinking about code points when they originally designed it. . . .

Code points hadn't been invented back then. I think it is Cay Horstmann who said that code points took Java® by surprise, even though Strings were designed to use Unicode from day one.
 
Stephan van Hulst
Saloon Keeper
Posts: 15608
366
Code points were invented back then. They're much older than Unicode.

The thing is that early versions of Java used Unicode 1.1.5, which used UCS-2 to encode code points. So a char really was equal to a Unicode code point.

Unicode 2.0 replaced UCS-2 with UTF-16. UTF-16 is backwards compatible with UCS-2, but it encodes a superset of the character set encoded by UCS-2, which also meant that a 2 byte char could no longer represent all code points.
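The consequence is easy to see with the `Character` helpers. A minimal sketch, with a hypothetical class name and U+1F600 picked as an arbitrary code point above U+FFFF:

```java
class SupplementaryCodePoint {
    // A code point above U+FFFF, chosen only for illustration.
    static final int GRINNING_FACE = 0x1F600;

    static int charsNeeded(int codePoint) {
        // 1 for code points in the BMP, 2 for supplementary code points.
        return Character.charCount(codePoint);
    }

    static char[] encode(int codePoint) {
        // Returns the UTF-16 code units: one char, or a surrogate pair.
        return Character.toChars(codePoint);
    }
}
```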
 
Tim Holloway
Saloon Keeper
Posts: 27851
196

Stephan van Hulst wrote:
Unicode 2.0 replaced UCS-2 with UTF-16. UTF-16 is backwards compatible with UCS-2, but it encodes a superset of the character set encoded by UCS-2, which also meant that a 2 byte char could no longer represent all code points.



However, as mentioned previously, a String is not intended to be literally a set of code points, but rather a sequence of Characters. How many bytes a given code point requires (especially if compaction algorithms are applied) is not germane to the issue.

It's a mistake to think that "char" == "byte". It was never literally true, just that on popular machines way back then, you could fit a character's ASCII or EBCDIC code point into a byte. A byte being defined as the smallest number of bits directly addressable by a machine instruction. Which may or may not also be the smallest directly-addressable unit of RAM and whose actual size thus depended on the hardware, not an absolute number of bits. You can see this in things like communications specs, which define in "octets" - units of 8 bits - rather than bytes. Addressability aside, octets don't necessarily have to be boundary-aligned. After all, in a serial data stream, there are no physical boundaries. And no, pounding out the bits in 8-to-12 bit TTY packets doesn't count.

Before EBCDIC, there was BCDIC, which was a 6-bit character code. You'd have to be either more ancient than I am or involved with, say, the internal workings of IBM's 3270-predecessor CRT display system to have worked with that one, though. More recently, actual ASCII was a 7-bit character code, which is why both ASCII and the 8-bit extended ASCII used on IBM PCs got held in bytes.

All of the major modern-day computers work on 8-bit bytes, and that is the official extent of a byte in Java. But that's simply how the standard was defined.
 
Stephan van Hulst
Saloon Keeper
Posts: 15608
366

Tim Holloway wrote:However, as mentioned previously, a String is not intended to be literally a set of code points, but rather a sequence of Characters.


I posit that when Java was first developed, the designers considered the two equal. Character == char == Unicode code point == UCS-2 byte sequence.

That's obviously not true anymore, but I'm certain that's what they thought back at the beginning.

I bet that if the Java designers had developed String after the introduction of Unicode 2.0, char would have represented a Unicode code point as we currently know it, regardless of how a sequence of code points is encoded internally.

If they did a do-over, I'm not sure that it would be wise to commit to a fixed byte size for a char, or even to make it a numeric primitive.

I'm also not sure if it is a good idea to allow the serialization of a single char. I'm thinking serialization should always occur on a String as a whole.

How many bytes a given code point requires (especially if compaction algorithms are applied) is not germane to the issue.


It is though. The original thinking was that all possible characters could easily be encoded in two bytes and it's the reason that char is now fixed at two bytes, and it's also the reason why it's folly to think of a char as a "character". The only correct way of thinking of char as we currently know it is "UTF-16 code unit".

It's a mistake to think that "char" == "byte".


I don't think anybody tried to make that point.
 
Mike Simmons
Master Rancher
Posts: 4905
74
SvH++

I am 100% in agreement with everything Stephan just posted.  Well said!
 
Paul Clapham
Marshal
Posts: 28258
95
Although I wasn't working with languages similar to Java in 1995, it's my impression that Java was the first language to commit to Unicode. This was actually a radical step, since such languages tended to have the "poverty viewpoint". Minimization of space and time was very important and the idea of using two bytes when only one was (almost always) necessary would have been a great extravagance. They could have chosen to store text data in an array of bytes with "code points" being a way of storing characters which needed more than 8 bits (UTF-8), but instead they decided to keep it simple and define String the way they did. I'm sure there was a lot of griping about that, with people storing their text data in byte arrays to save space.

Ironically they were forced to implement "code points" later, when UTF-16 wasn't enough!
 