Using swprintf with wchar_t...

SephirothSephiroth Fayetteville, NC, USA
I am attempting to do my current big C++ project in Unicode instead of classic ANSI, but have hit some enormous roadblocks along the way. This latest one seems impossible to do in Unicode, and may require switching back to ANSI for speed and simplicity. I need to copy strings of text into specific locations in an array of wchar_t values. The thing is, while my ANSI code works, the Unicode equivalent fails and has cost me hours of searching and trying things. If this requires a character-by-character copy, ANSI just became superior to Unicode.
[code]
wchar_t *pTemp;

pTemp = new wchar_t[256];

for(unsigned short Loop = 0; Loop < Count; Loop++)
swprintf(pTemp[(Loop * 64)], L"%s", pSome64wcharVariable);
[/code]
This fails, as does wmemcpy. In the example above, assume I allocated storage for four null-terminated strings. The pTemp variable would use 0-63 for the first string, 64-127 for the second, 128-191 for the third, and 192-255 for the last. This way I can easily loop through and see what strings I have stored in that array, but I have no clue how to copy in 64 wide characters at a time! Nothing I try works, and I am exhausted trying to find a solution. This is cake in ANSI, but Unicode won't allow it. So how can I copy a specified number of characters into a specific location in a string without using a byte-by-byte copy method?

An easier way to think about what I need to do is this. Assume I have an array of wchar_t variables, declared "wchar_t pBuffer[1024];". This gives me enough storage for sixteen "wchar_t pTemp[64];" type of strings. How would I simply write 64 characters at a time to any location in the big array?

-Sephiroth

Comments

  • Either use the Unicode version of string copy (wcscpy), or use memcpy to copy (wcslen(s) + 1) * sizeof(wchar_t) bytes, so that the terminating null is included.
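As a rough sketch of that suggestion: the call in the original post passes pTemp[Loop * 64], which is a single wchar_t value rather than a pointer to the slot, and that alone would make swprintf or wmemcpy fail. The helper name StoreInSlot and the constant kSlotLen below are made up for illustration, and the 64-character slot size is taken from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Assumed slot size: 64 wchar_t per string, as described in the question.
const std::size_t kSlotLen = 64;

// Hypothetical helper: copies src into slot `index` of `buffer`,
// truncating to 63 characters so the slot is always null-terminated.
void StoreInSlot(wchar_t* buffer, std::size_t index, const wchar_t* src)
{
    assert(buffer != nullptr && src != nullptr);

    // Address of the slot. Note this is a pointer (buffer + offset),
    // not the wchar_t value buffer[offset].
    wchar_t* slot = buffer + index * kSlotLen;

    std::size_t len = std::wcslen(src);
    if (len > kSlotLen - 1)
        len = kSlotLen - 1;

    std::wmemcpy(slot, src, len);  // count is in wide characters, not bytes
    slot[len] = L'\0';
}
```

With a 256-element buffer this gives four slots, so StoreInSlot(pTemp, Loop, pSome64wcharVariable) would stand in for the swprintf call inside the loop.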

    Best Regards,
    Richard

    The way I see it... Well, it's all pretty blurry
  • SephirothSephiroth Fayetteville, NC, USA
    I just wrote my own Unicode string class that is quite flexible and all is good. I only have one question, since Windows is so awkward about this. Windows frequently uses the "LPCWSTR" type, which is defined as "const WCHAR *", and WCHAR is defined as "wchar_t". Now, I have a method in my string class that requires one to pass "wchar_t*" to it. Since Windows can't be standard and uses that type, is it safe to type-cast it?
    [code]
    void ExampleFunction(wchar_t *pText)
    {
    //Assume I do something in here
    }

    //Assume this is my main file
    WNDCLASSEX *pClass;

    pClass->lpszClassName = "DemoClass";
    ExampleFunction((wchar_t*)pClass->lpszClassName);
    [/code]
    Is it safe to typecast this way, since all LPCWSTR really is, is "const wchar_t *"? If this is fine, then I am all good.

    -Sephiroth
  • No, because wchar_t is 2 bytes wide on Windows, whereas your string literal has chars of size 1. So a Unicode string literal L"abc" looks like the ASCII bytes {'a', 0, 'b', 0, 'c', 0, 0, 0}. Whereas an ASCII string literal "abc", read as Unicode, looks like { (wchar_t)('a' + ('b' << 8)), (wchar_t)('c' + ('\0' << 8)), ...buffer overflow till a 16-bit 0 is found... }.

    Why don't you just use wchar_t everywhere for strings? Windows uses the TCHAR type and tchar header to handle both ascii and unicode characters in a similar way; something like:
    [code]
    #ifdef UNICODE
    typedef wchar_t TCHAR;
    #else
    typedef char TCHAR;
    #endif

    // Then all the LPTSTR, LPCTSTR are defined using TCHAR

    // One more significant macro for string literals:
    #ifdef UNICODE
    #define _T(q) L ## q   // Unicode: prepend L to the literal
    #else
    #define _T(q) q        // Ascii: leave the literal as it is
    #endif
    [/code]

    That way, you can define UNICODE if you want to use Unicode, and everything (the tchar stuff) will automatically adjust without any changes to your code. Use the _T macro on string literals to make sure they are of the right type: _T("Hello") will turn the ASCII literal into a Unicode one if necessary. Note that this only works on string literals, and not on any other string variables (i.e., this won't work: _T(szName) !)

    Of course, this is all very nice if you want it to be portable etc., but if you've just fixed the string type in your OS, then you can just define CHAR as being wchar_t and then use CHAR everywhere in your program, and for each string literal use the _T() macro (define it as above).

    If you want the two different sizes, you could either overload your function to take both types, or create a helper routine, like:
    [code]
    // One note: was the extra in your other code to create
    // a unicode termination?
    char szText[] = "Something here";
    CString myString = CString::FromAscii(szText);
    [/code]
    Where FromAscii is a static member of your CString class, which returns a string object from an ascii text.
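To illustrate the helper-routine idea, here is a minimal sketch of such a conversion as a free function returning std::wstring. The CString class is not shown anywhere in this thread, so this is only a hypothetical stand-in for its FromAscii member, and it assumes the input really is plain 7-bit ASCII (anything else would need a proper code page conversion).

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical stand-in for CString::FromAscii: widens each char to a
// wchar_t. Only correct for 7-bit ASCII input.
std::wstring FromAscii(const char* szText)
{
    assert(szText != nullptr);

    std::wstring result;
    result.reserve(std::strlen(szText));

    for (const char* p = szText; *p != '\0'; ++p)
    {
        // Go through unsigned char so a char with the high bit set
        // is not sign-extended into a large wchar_t value.
        result.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
    }
    return result;
}
```

The point is that the characters are converted one by one into a new buffer. A pointer cast never does this.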

    Anyway, what a ranting :P Good luck

    Best Regards,
    Richard

    The way I see it... Well, it's all pretty blurry
  • SephirothSephiroth Fayetteville, NC, USA
    : No, because wchar_t is 2 bytes wide on Windows, whereas your string
    : literal has chars of size 1. So a Unicode string literal L"abc" looks
    : like the ASCII bytes {'a', 0, 'b', 0, 'c', 0, 0, 0}. Whereas an ASCII
    : string literal "abc", read as Unicode, looks like
    : { (wchar_t)('a' + ('b' << 8)), (wchar_t)('c' + ('\0' << 8)),
    : ...buffer overflow till a 16-bit 0 is found... }.

    So you're telling me "const wchar_t" stores values in memory differently than regular "wchar_t"? I fail to believe that. Let me show you something out of the Windows typedef headers.
    [code]
    typedef wchar_t WCHAR;
    typedef CONST WCHAR *LPCWSTR;
    [/code]
    So how exactly is LPCWSTR storing things differently? I mean if you look at it, LPCWSTR is the EXACT same as "const wchar_t *". To my knowledge, const does not change Unicode to ANSI storage. So explain how this is different, because as far as I can tell, LPCWSTR and wchar_t * are identical, only LPCWSTR is constant.

    http://msdn2.microsoft.com/en-us/library/aa383751.aspx

    -Sephiroth
  • : So you're telling me "const wchar_t" stores values in memory
    : differently than regular "wchar_t"? I fail to believe that.

    I'm not saying that. I'm saying char stores things differently than wchar_t.

    The reason for my post was this code of yours:
    [code]
    // Here you are storing a char[] (ascii, 1 byte)
    pClass->lpszClassName = "DemoClass";
    // Here you are changing a char[] to a wchar_t[]
    // Or at least, trying to. It won't work properly,
    // because all that is changed is the pointer and
    // not the actual data
    ExampleFunction((wchar_t*)pClass->lpszClassName);
    [/code]


    Best Regards,
    Richard

    The way I see it... Well, it's all pretty blurry
  • SephirothSephiroth Fayetteville, NC, USA
    Upon mousing over the variable in VS2005, I get "LPCWSTR tagWNDCLASSEXW::lpszClassName" in a popup window. The member is an LPCWSTR when programming in UNICODE.
    [code]
    #if(WINVER >= 0x0400)
    typedef struct tagWNDCLASSEXA {
    UINT cbSize;
    /* Win 3.x */
    UINT style;
    WNDPROC lpfnWndProc;
    int cbClsExtra;
    int cbWndExtra;
    HINSTANCE hInstance;
    HICON hIcon;
    HCURSOR hCursor;
    HBRUSH hbrBackground;
    LPCSTR lpszMenuName;
    LPCSTR lpszClassName;
    /* Win 4.0 */
    HICON hIconSm;
    } WNDCLASSEXA, *PWNDCLASSEXA, NEAR *NPWNDCLASSEXA, FAR *LPWNDCLASSEXA;
    typedef struct tagWNDCLASSEXW {
    UINT cbSize;
    /* Win 3.x */
    UINT style;
    WNDPROC lpfnWndProc;
    int cbClsExtra;
    int cbWndExtra;
    HINSTANCE hInstance;
    HICON hIcon;
    HCURSOR hCursor;
    HBRUSH hbrBackground;
    LPCWSTR lpszMenuName;
    LPCWSTR lpszClassName;
    /* Win 4.0 */
    HICON hIconSm;
    } WNDCLASSEXW, *PWNDCLASSEXW, NEAR *NPWNDCLASSEXW, FAR *LPWNDCLASSEXW;
    #ifdef UNICODE
    typedef WNDCLASSEXW WNDCLASSEX;
    typedef PWNDCLASSEXW PWNDCLASSEX;
    typedef NPWNDCLASSEXW NPWNDCLASSEX;
    typedef LPWNDCLASSEXW LPWNDCLASSEX;
    #else
    typedef WNDCLASSEXA WNDCLASSEX;
    typedef PWNDCLASSEXA PWNDCLASSEX;
    typedef NPWNDCLASSEXA NPWNDCLASSEX;
    typedef LPWNDCLASSEXA LPWNDCLASSEX;
    #endif // UNICODE
    #endif /* WINVER >= 0x0400 */
    [/code]
    As you can clearly see, when using UNICODE, you kind of use UNICODE. Windows sets this data in "Winuser.h", just search for "WNDCLASSEX" and it'll pop right up. So yes, lpszClassName *IS* wchar_t in a UNICODE project like mine. You did answer my question though, that I can cast a "const wchar_t" to "wchar_t". Just remember, when a user tells you they're doing a UNICODE project, they ARE using UNICODE.

    -Sephiroth
  • : Just remember, when a user tells you they're doing a UNICODE project,
    : they ARE using UNICODE.
    :

    Well in the example you posted you weren't using unicode.
    You see, as much as the user wants to be unicode, string literals ("string here") are ASCII by default in VS2005.
    So, for unicode strings:
    [code]
    wchar_t szText[] = "Some text here";
    [/code]
    Does not compile, because "Some text here" is char[], and not wchar_t[].
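For comparison, a small sketch of the two literal forms side by side. The L prefix is what makes a literal wide; without it the literal is an array of char. The names szNarrow, szWide and WideLength are only for illustration.

```cpp
#include <cstddef>
#include <cwchar>

// Narrow literal: an array of char, one element per character plus the null.
char szNarrow[] = "Some text here";

// Wide literal: the L prefix makes it an array of wchar_t.
wchar_t szWide[] = L"Some text here";

// Length in wide characters, not counting the terminating null.
std::size_t WideLength()
{
    return std::wcslen(szWide);
}
```

Both arrays have the same number of elements. Only the element type, and therefore the size in bytes, differs.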

    Maybe there's a compiler switch that makes all string literals unicode by default, but I don't know. So I assumed that there is no such switch and that you provided code that compiles, so then I had to conclude you were mixing up Unicode and Ascii.

    In my defense, before you start flaming me again, I think I see where this conversation went wrong: I didn't realise you had solved the original problem after my first post and that the second problem was a new (unrelated) question.
    With this in mind, this is what I should've/would've posted to your second post:
    "There's no need to typecast it explicitly, since casting from type to const type is an 'upcast', so to speak. If you insist on explicitly casting it, why not use the cast (LPCWSTR), which'll work in any situation"


    Best Regards,
    Richard

    The way I see it... Well, it's all pretty blurry
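On the const question raised earlier in the thread (passing an LPCWSTR to a function that takes wchar_t*), one possible approach is sketched below. It assumes the function only needs to read the text, which the thread does not actually state. The signature shown is a hypothetical rewrite of ExampleFunction, and MakeEditableCopy is a made-up helper for the case where the text must be modified.

```cpp
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical read-only variant of ExampleFunction. Taking const wchar_t*
// means an LPCWSTR (const wchar_t*) can be passed with no cast at all.
std::size_t ExampleFunction(const wchar_t* pText)
{
    return std::wcslen(pText);  // placeholder for "do something in here"
}

// If the text has to be changed, work on a copy instead of casting
// the const away: writing through a pointer to a string literal is
// undefined behaviour.
std::wstring MakeEditableCopy(const wchar_t* pText)
{
    return std::wstring(pText);
}
```

A C-style cast such as (wchar_t*) compiles, but it is only safe while the callee never writes through the pointer, which is why changing the parameter type is usually the cleaner fix.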
  • SephirothSephiroth Fayetteville, NC, USA
    I wasn't flaming you, and your example above compiles fine in a UNICODE environment. You can make any project full UNICODE by using the properties screens, accessed by pressing alt+f7. There are several options that need to be changed to do this if your copy of VS2005 defaults to ANSI. Mine defaults to UNICODE now.

    -Sephiroth