What's the real size of char?

innocent_al Posts: 3 Member
I was reading Bruce Eckel's Thinking in C++, which said that

"A char is for character storage and uses a minimum of 8 bits (one byte) of storage, although it may be larger. "

This is inconsistent with another book that I read, Programming in C++ by Nell Dale et al., which states that

"Then by definition, sizeof(char) = 1

...

The only guarantee made by the C++ language is that:

1 = sizeof(char) <= sizeof(short) <= sizeof(int) <= sizeof(long)

and

sizeof(float) <= sizeof(double) <= sizeof(long double) "

Which one is right?

Alvin

Comments

  • keren Posts: 14 Member
    It's system independent.

    I suppose compilers are allowed to allocate however much they wish for a char. I'd bet that all compilers today allocate 1 byte (thus sizeof(char) == 1) because it seems wasteful to allocate more for ASCII chars. On your system, write a program that does

    cout << "char size is " << sizeof(char) << endl;

    and you'll find out what the case is on your system... Anyhow, the size will be in bytes, and in whole numbers (always). Maybe one day they will switch to Unicode and allocate more... In Java, for example, they use 16-bit Unicode characters, to allow a greater character set. So why limit the language?
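
    For example, a complete version of that test program might look like this (a minimal sketch, assuming only a standard-conforming compiler and the <iostream> header):

        #include <iostream>

        int main()
        {
            // sizeof reports sizes in bytes; for char the result is 1 by definition
            std::cout << "char size is  " << sizeof(char) << std::endl;
            std::cout << "short size is " << sizeof(short) << std::endl;
            std::cout << "int size is   " << sizeof(int) << std::endl;
            std::cout << "long size is  " << sizeof(long) << std::endl;
            return 0;
        }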

    Keren


  • keren Posts: 14 Member
    I meant, the actual size of char in bytes is system DEpendent, but sizeof(char) will be the correct size of a char on your system, independently of the system.

    K


  • Eric Tetz Posts: 2,141 Member
    : I meant, the actual size of char in bytes is system DEpendent

    The size of a 'char' in [italic]bits[/italic] is system dependent (but must be at least 8). The size of char in bytes is 1, by definition.

    : but sizeof(char) will be the correct size of a char on your system, independently of the system

    The sizeof(char) is 1, always.
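
    To make the distinction concrete, sizeof can be combined with CHAR_BIT from <climits>, which gives the number of bits in a byte on the current system (a small sketch, not specific to any particular compiler):

        #include <climits>   // CHAR_BIT: bits per byte on this system
        #include <iostream>

        int main()
        {
            // sizeof(char) is always 1 byte; CHAR_BIT says how many bits that byte holds
            std::cout << "bits per byte:  " << CHAR_BIT << std::endl;
            std::cout << "bits in a char: " << sizeof(char) * CHAR_BIT << std::endl;
            std::cout << "bits in an int: " << sizeof(int) * CHAR_BIT << std::endl;
            return 0;
        }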

    Cheers,
    Eric

  • Eric Tetz Posts: 2,141 Member
    : I was reading Bruce Eckel's Thinking in C++, which said that
    :
    : "A char is for character storage and uses a [b]minimum of 8 bits[/b] (one byte) of storage, although it may be larger. "
    :
    : This [b]is inconsistent with[/b] another book that I read, Programming in C++ by Nell Dale et al., which states that
    :
    : "Then by definition, [b]sizeof(char) = 1[/b]

    That's not inconsistent. sizeof() returns the number of [italic]bytes[/italic], not the number of bits.

    Cheers,
    Eric
  • roland_chang Posts: 27 Member

    I think the traditional definition of char is 8 bits (1 byte), enough for the standard ASCII character set. With the introduction of C++ and the requirements of internationalization, support for character sets with more than 256 characters became necessary. Windows uses Unicode; in the Windows implementation, 2 bytes hold a character, covering ASCII as well as Chinese, Japanese, etc. You can use a hex dump utility to inspect a Windows executable's messages; there are usually 2 bytes for each character.
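
    For wide characters, C++ has the built-in wchar_t type; its size is implementation-defined, so the quickest way to see what a given compiler uses is to print it (the sizes in the comment are typical values, not guarantees):

        #include <iostream>

        int main()
        {
            // wchar_t is commonly 2 bytes on Windows and 4 on many Unix systems,
            // but the standard does not fix its size
            std::cout << "sizeof(char)    = " << sizeof(char) << std::endl;
            std::cout << "sizeof(wchar_t) = " << sizeof(wchar_t) << std::endl;
            return 0;
        }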



  • innocent_al Posts: 3 Member
    : That's not inconsistent. sizeof() returns the number of [italic]bytes[/italic], not the number of bits.

    Well... To me, it's inconsistent because the book by Eckel states that it can be larger than 8 bits (which could mean 16 bits = 2 bytes), while the book by Dale states that it has to be 1 byte no matter what.

    Thanks,

    Alvin.
  • Eric Tetz Posts: 2,141 Member
    : Well... To me, it's inconsistent because the book by Eckel states that it can be larger than 8 bits (which could mean 16 bits = 2 bytes), while the book by Dale states that it has to be 1 byte no matter what.

    Ahhh, I see what's confusing you...

    Actually, as far as the language definition is concerned, a byte [italic]is[/italic] a char, and it doesn't necessarily have 8 bits. It has [italic]at least[/italic] 8 bits, but could have more. On some machines, a char (i.e. a byte) may have 9 bits, or 11, or 64. In all cases, sizeof(char) is 1.

    Cheers,
    Eric

  • innocent_al Posts: 3 Member
    : Actually, as far as the language definition is concerned, a byte [italic]is[/italic] a char, and it doesn't necessarily have 8 bits. It has [italic]at least[/italic] 8 bits, but could have more. On some machines, a char (i.e. a byte) may have 9 bits, or 11, or 64. In all cases, sizeof(char) is 1.

    Thanks. I'm gaining a new insight on this. :) So, my assumption that 1 byte = 8 bits was wrong, and your message implies that the size of a byte depends on the machine. But why is a byte always said to be equal to 8 bits?

    Or are we referring to a byte (in this situation) as a certain machine's smallest size for a variable or something? (And isn't that called the word size?)

    Alvin.
  • sauron Posts: 61 Member
    : I think the traditional definition of char is 8 bits (1 byte), enough for the standard ASCII character set. With the introduction of C++ and the requirements of internationalization, support for character sets with more than 256 characters became necessary. Windows uses Unicode; in the Windows implementation, 2 bytes hold a character, covering ASCII as well as Chinese, Japanese, etc. You can use a hex dump utility to inspect a Windows executable's messages; there are usually 2 bytes for each character.


    Indeed, most personal computers run on the assumption that 8 bits is 1 byte, but this does not need to be the case, and indeed, ASCII does not assume such a thing. The original ASCII code set for the English alphabet (strictly US-ASCII; Latin 1 is an 8-bit extension of it) actually only defines 128 codes (seven bits). The other 128 codes (the difference between 2^7 and 2^8) are usually used by the creator of the font to define other characters. The normal set used in MS-DOS was established by IBM as their extension of ASCII, filling the extra room with characters such as line-drawing symbols and international characters not in Latin 1. This does not stop other OEM machines from using a different set.

    Now, with the advent of Windows, the extra 128 can be rearranged, since many of them are no longer needed to draw pictures and boxes. Unicode does define 65536 different characters and other types of codes, but not all Windows programs have to use this. It depends on the version of Windows that you have installed and what options you have enabled on that system. I believe the newer versions of Windows have Unicode by default, but some of the older ones (Win98 SE, which I run) only have partial support for the Unicode set. Either way, it is best to write your code (if you are programming Windows code) so that it can compile under any circumstances.
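
    Incidentally, whether those upper 128 codes appear as positive or negative values in a plain char depends on whether char is signed on a given compiler, which the limits in <climits> reveal (a small sketch; the values in the comment are only the common case):

        #include <climits>
        #include <iostream>

        int main()
        {
            // CHAR_MIN is 0 when plain char is unsigned; negative (commonly -128)
            // when it is signed, in which case codes above 127 come out negative
            std::cout << "CHAR_MIN = " << CHAR_MIN << std::endl;
            std::cout << "CHAR_MAX = " << CHAR_MAX << std::endl;
            return 0;
        }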

    Happy coding,
    Sauron
    Talking much about oneself can also be a means to conceal oneself.
    -- Friedrich Nietzsche



  • whoie Posts: 672 Member
    : But why is a byte always said to be equal to 8 bits?

    Some people live in a small universe. IOW, don't believe everything you read. In other languages a byte may very well be 8 bits, but C and C++ are designed to be portable languages. Assumptions like that would severely limit their usefulness, so C and C++ simply state that a byte is equal to CHAR_BIT bits (from <climits>) in C, or numeric_limits<unsigned char>::digits from <limits> in C++. It is true that the most common value for both is 8, but that isn't universal. For example, the processor I have been doing the most work on lately, Motorola's DSP56F805, has 16-bit bytes. But sizeof(char) is still 1; incidentally, so is sizeof(int)!


    : or are we referring to byte (in this situation) as a certain machine's smallest size for a variable or something? (and isn't that called the word size?)

    It is the smallest addressable unit of storage, to be technical. Word size usually means the register size of the processor, and that is usually sizeof(int).
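
    Both spellings of the bits-per-byte constant can be checked directly; on a conforming implementation they agree (a minimal sketch):

        #include <climits>   // CHAR_BIT, the C-style constant
        #include <iostream>
        #include <limits>    // std::numeric_limits, the C++-style equivalent

        int main()
        {
            // digits counts the value bits of unsigned char, i.e. the bits in a byte
            std::cout << "CHAR_BIT: " << CHAR_BIT << std::endl;
            std::cout << "numeric_limits<unsigned char>::digits: "
                      << std::numeric_limits<unsigned char>::digits << std::endl;
            return 0;
        }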


    HTH,
    Will
    --
    http://www.tuxedo.org/~esr/faqs/smart-questions.html
    http://www.eskimo.com/~scs/C-faq/top.html
    http://www.parashift.com/c++-faq-lite/
    http://www.accu.org/

