mingw-org-wsl: Commit

The MinGW.OSDN Windows System Libraries. Formerly designated as "MinGW.org Windows System Libraries", this encapsulates the "mingwrt" C runtime library extensions, and the "w32api" 32-bit MS-Windows API libraries.

Please note that this project no longer owns the "MinGW.org" domain name; any software which may be distributed from that domain is NOT supported by this project.


Commit MetaInfo

Revision: fc451f9b494dd0b6bafb17ba9e35efd9a9d815e7 (tree)
Time: 2020-04-08 03:55:12
Author: Keith Marshall <keith@user...>
Committer: Keith Marshall

Log Message

Document MinGW MBCS/wide character conversion functions.

Change Summary

Diff

--- a/mingwrt/ChangeLog
+++ b/mingwrt/ChangeLog
@@ -1,3 +1,11 @@
1+2020-04-07 Keith Marshall <keith@users.osdn.me>
2+
3+ Document MinGW MBCS/wide character conversion functions.
4+
5+ * man/btowc.3.man man/mbrlen.3.man man/mbrtowc.3.man
6+ * man/mbsinit.3.man man/mbsrtowcs.3.man man/wcrtomb.3.man
7+ * man/wcsrtombs.3.man man/wctob.3.man: New files.
8+
19 2020-04-02 Keith Marshall <keith@users.osdn.me>
210
311 Handle wcsrtombs() initial surrogate completion.
--- /dev/null
+++ b/mingwrt/man/btowc.3.man
@@ -0,0 +1,169 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B \%btowc
6+\- convert a single byte to a wide character
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < stdio.h >
12+.br
13+.B #include
14+.RB < wchar.h >
15+.PP
16+.B wint_t btowc( int
17+.I c
18+.B );
19+.
20+.IP \& -4n
21+Feature Test Macro Requirements for libmingwex:
22+.PP
23+.BR \%__MSVCRT_VERSION__ :
24+since \%mingwrt\(hy5.3,
25+if this feature test macro is
26+.IR defined ,
27+with a value of
28+.I at least
29+.IR \%0x0800 ,
30+(corresponding to the symbolic constant,
31+.BR \%__MSVCR80_DLL ,
32+and thus declaring intent to link with \%MSVCR80.DLL,
33+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
34+instead of with \%MSVCRT.DLL),
35+calls to
36+.BR \%btowc ()
37+will be directed to the implementation thereof,
38+within \%Microsoft\(aqs runtime DLL.
39+.
40+.PP
41+.BR \%_ISOC99_SOURCE ,
42+.BR \%_ISOC11_SOURCE :
43+since \%mingwrt\(hy5.3.1,
44+when linking with \%MSVCRT.DLL,
45+or when
46+.B \%__MSVCRT_VERSION__
47+is either
48+.IR undefined ,
49+or is
50+.I defined
51+with any value which is
52+.I less than
53+.IR \%0x0800 ,
54+(thus denying intent to link with \%MSVCR80.DLL,
55+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
56+.I explicitly
57+defining either of these feature test macros
58+will cause any call to
59+.BR \%btowc ()
60+to be directed to the
61+.I \%libmingwex
62+implementation;
63+if neither macro is defined,
64+calls to
65+.BR \%btowc ()
66+will be directed to Microsoft\(aqs runtime implementation,
67+if it is available,
68+otherwise falling back to the
69+.I \%libmingwex
70+implementation.
71+.
72+.PP
73+Prior to \%mingwrt\(hy5.3,
74+none of the above feature test macros have any effect on
75+.BR \%btowc ();
76+all calls will be directed to the
77+.I \%libmingwex
78+implementation.
79+.
80+.
81+.SH DESCRIPTION
82+If
83+.I c
84+is not
85+.BR EOF ,
86+the
87+.BR \%btowc ()
88+function attempts to interpret
89+.I c
90+as a multibyte character sequence of length
91+.IR one ;
92+if the single byte evaluated represents a complete multibyte character,
93+in the codeset which is associated with the
94+.B \%LC_CTYPE
95+category of the active process locale,
96+.BR \%btowc ()
97+converts it to,
98+and returns,
99+its equivalent wide character value.
100+.
101+.
102+.SH RETURN VALUE
103+If
104+.I c
105+is
106+.BR EOF ,
107+or if it does not represent a complete multibyte
108+character sequence of length
109+.IR one ,
110+.BR \%btowc ()
111+returns
112+.BR WEOF ;
113+otherwise the conversion of the single byte character,
114+to its equivalent wide character value,
115+is returned.
116+.
117+.
118+.SH ERROR CONDITIONS
119+No error conditions are defined.
120+.
121+.
122+.SH STANDARDS CONFORMANCE
123+Except to the extent that it may be affected by limitations
124+of the underlying \%MS\(hyWindows API,
125+the
126+.I \%libmingwex
127+implementation of
128+.BR \%btowc ()
129+conforms generally to
130+.BR \%ISO\(hyC99 ,
131+.BR \%POSIX.1\(hy2001 ,
132+and
133+.BR \%POSIX.1\(hy2008 ;
134+(prior to \%mingwrt\(hy5.3,
135+and in those cases where calls may be delegated
136+to a Microsoft runtime DLL implementation,
137+this level of conformity may not be achieved).
138+.
139+.
140+.\"SH EXAMPLE
141+.
142+.
143+.SH CAVEATS AND BUGS
144+Use of the
145+.BR \%btowc ()
146+function is
147+.IR discouraged ;
148+it serves no purpose which may not be better served by the
149+.BR \%mbrtowc (3)
150+function,
151+which should be considered as a preferred alternative.
152+.
153+.
154+.SH SEE ALSO
155+.BR mbrtowc (3)
156+.
157+.
158+.SH AUTHOR
159+This manpage was written by \%Keith\ Marshall,
160+\%<keith@users.osdn.me>,
161+to document the
162+.BR \%btowc ()
163+function as it has been implemented for the MinGW.org Project.
164+It may be copied, modified and redistributed,
165+without restriction of copyright,
166+provided this acknowledgement of contribution by
167+the original author remains in place.
168+.
169+.\" EOF
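The CAVEATS AND BUGS section of the btowc(3) page above recommends mbrtowc(3) as the preferred alternative. The following is a minimal sketch of that substitution, assuming only standard C99 facilities; the helper name byte_to_wide() is hypothetical, and is not part of mingwrt.

```c
#include <stdio.h>   /* EOF */
#include <string.h>  /* memset() */
#include <wchar.h>   /* mbrtowc(), mbstate_t, wint_t, WEOF */

/* Hypothetical helper, (not part of mingwrt): convert a single byte
 * to a wide character by way of mbrtowc(), returning WEOF if the byte
 * is EOF, or if it is not a complete multibyte character of length one.
 */
wint_t byte_to_wide( int c )
{
  mbstate_t st;
  wchar_t wc;
  char byte;
  size_t n;

  if( c == EOF ) return WEOF;

  /* Always start from the initial conversion state. */
  memset( &st, 0, sizeof( st ) );
  byte = (char)(c);
  n = mbrtowc( &wc, &byte, 1, &st );

  /* (size_t)(-1) indicates an invalid sequence, and (size_t)(-2)
   * an incomplete one; neither is a complete single byte character.
   */
  if( (n == (size_t)(-1)) || (n == (size_t)(-2)) ) return WEOF;
  return (wint_t)(wc);
}
```

In the default "C" locale, byte_to_wide( 'A' ) yields the same value as btowc( 'A' ), and byte_to_wide( EOF ) yields WEOF.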
--- /dev/null
+++ b/mingwrt/man/mbrlen.3.man
@@ -0,0 +1,377 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B mbrlen
6+\- determine the number of bytes in a multibyte character
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < wchar.h >
12+.PP
13+.B size_t mbrlen( const char
14+.BI * s ,
15+.B size_t
16+.IB n ,
17+.B mbstate_t
18+.BI * ps
19+.B );
20+.
21+.
22+.IP \& -4n
23+Feature Test Macro Requirements for libmingwex:
24+.PP
25+.BR \%__MSVCRT_VERSION__ :
26+since \%mingwrt\(hy5.3,
27+if this feature test macro is
28+.IR defined ,
29+with a value of
30+.I at least
31+.IR \%0x0800 ,
32+(corresponding to the symbolic constant,
33+.BR \%__MSVCR80_DLL ,
34+and thus declaring intent to link with \%MSVCR80.DLL,
35+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
36+instead of with \%MSVCRT.DLL),
37+calls to
38+.BR \%mbrlen ()
39+will be directed to the implementation thereof,
40+within \%Microsoft\(aqs runtime DLL.
41+.
42+.PP
43+.BR \%_ISOC99_SOURCE ,
44+.BR \%_ISOC11_SOURCE :
45+since \%mingwrt\(hy5.3.1,
46+when linking with \%MSVCRT.DLL,
47+or when
48+.B \%__MSVCRT_VERSION__
49+is either
50+.IR undefined ,
51+or is
52+.I defined
53+with any value which is
54+.I less than
55+.IR \%0x0800 ,
56+(thus denying intent to link with \%MSVCR80.DLL,
57+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
58+.I explicitly
59+defining either of these feature test macros
60+will cause any call to
61+.BR \%mbrlen ()
62+to be directed to the
63+.I \%libmingwex
64+implementation;
65+if neither macro is defined,
66+calls to
67+.BR \%mbrlen ()
68+will be directed to Microsoft\(aqs runtime implementation,
69+if it is available,
70+otherwise falling back to the
71+.I \%libmingwex
72+implementation.
73+.
74+.PP
75+Prior to \%mingwrt\(hy5.3,
76+none of the above feature test macros have any effect on
77+.BR \%mbrlen ();
78+all calls will be directed to the
79+.I \%libmingwex
80+implementation.
81+.
82+.
83+.SH DESCRIPTION
84+The
85+.BR \%mbrlen ()
86+function inspects the sequence of bytes,
87+starting at
88+.IR s ,
89+up to a maximum of
90+.I n
91+bytes,
92+to determine the number of bytes required to complete
93+the next multibyte code point,
94+commencing from the conversion state specified in
95+.IR *ps ,
96+(which is then updated).
97+.
98+.PP
99+The sequence of bytes,
100+pointed to by
101+.IR s ,
102+is interpreted as a multibyte character sequence
103+in the codeset which is associated with the
104+.B \%LC_CTYPE
105+category of the active process locale.
106+.
107+.PP
108+If
109+.I ps
110+is specified as a NULL pointer,
111+.BR \%mbrlen ()
112+will track conversion state using an internal
113+.B \%mbstate_t
114+object reference,
115+which is private within the
116+.BR \%mbrlen ()
117+process address space;
118+at process \%start\(hyup,
119+this internal
120+.B \%mbstate_t
121+object is initialized to represent
122+the initial conversion state.
123+.
124+.
125+.SH RETURN VALUE
126+If the multibyte sequence,
127+completed by
128+.I n
129+or fewer bytes,
130+does not represent the NUL code point,
131+then
132+.BR \%mbrlen ()
133+returns the number of bytes which are actually required
134+to complete the sequence,
135+(a number between 1 and
136+.IR n ,
137+inclusive),
138+and the conversion state,
139+as specified in
140+.IR *ps ,
141+is reset to the initial state.
142+.
143+.PP
144+On the other hand,
145+if the completed multibyte sequence
146+.I does
147+represent the NUL code point,
148+then
149+.BR \%mbrlen ()
150+returns zero,
151+and the conversion state,
152+as specified in
153+.IR *ps ,
154+is reset to the initial state.
155+.
156+.PP
157+If
158+.I n
159+is less than the effective
160+.B \%MB_CUR_MAX
161+for the active process locale,
162+and
163+.I n
164+bytes is insufficient to complete a multibyte character,
165+then
166+.I *ps
167+is updated to represent a new partially completed encoding state,
168+and
169+.BR \%mbrlen ()
170+returns
171+.IR \%(size_t)(\-2) .
172+Conversely,
173+if
174+.I n
175+is equal to,
176+or greater than
177+.BR \%MB_CUR_MAX ,
178+this return condition can arise,
179+only if the multibyte encoding sequence includes
180+redundant shift states;
181+since shift states are not used,
182+this cannot occur in any \%MS\(hyWindows
183+multibyte character set.
184+.
185+.
186+.SH ERROR CONDITIONS
187+If the sequence of
188+.I n
189+or fewer bytes,
190+pointed to by
191+.IR s ,
192+extends any pending encoding state recorded within
193+.IR *ps ,
194+to at least
195+.B \%MB_CUR_MAX
196+bytes,
197+and the resulting sequence does not represent
198+a valid multibyte character,
199+then
200+.I \%errno
201+is set to
202+.BR \%EILSEQ ,
203+and
204+.BR \%mbrlen ()
205+returns
206+.IR \%(size_t)(\-1) .
207+.
208+.PP
209+If,
210+on entry to
211+.BR \%mbrlen (),
212+the conversion state represented by
213+.I *ps
214+is deemed to be
215+.IR invalid ,
216+.I \%errno
217+is set to
218+.BR \%EINVAL ,
219+and
220+.BR \%mbrlen ()
221+returns
222+.IR \%(size_t)(\-1) ;
223+the conversion state may be deemed to be invalid if
224+it contains any sequence of bytes which does not match
225+a valid initial sequence from a multibyte character
226+representation within the currently active codeset,
227+if it can be interpreted as a complete multibyte character,
228+.I without
229+the addition of any further bytes from
230+.IR s ,
231+or if it represents a
232+.I surrogate\ pair
233+conversion,
234+resulting from a preceding call to the
235+.BR \%mbrtowc (3)
236+function,
237+and from which the
238+.I low\ surrogate
239+has yet to be retrieved.
240+.
241+.
242+.SH STANDARDS CONFORMANCE
243+Except to the extent that it may be affected by limitations
244+of the underlying \%MS\(hyWindows API,
245+the
246+.I \%libmingwex
247+implementation of
248+.BR \%mbrlen ()
249+conforms generally to
250+.BR \%ISO\(hyC99 ,
251+.BR \%POSIX.1\(hy2001 ,
252+and
253+.BR \%POSIX.1\(hy2008 ;
254+(prior to \%mingwrt\(hy5.3,
255+and in those cases where calls may be delegated
256+to a Microsoft runtime DLL implementation,
257+this level of conformity may not be achieved).
258+.
259+.PP
260+The feature whereby
261+.I \%errno
262+is set to
263+.BR EINVAL ,
264+when
265+.I *ps
266+is found to be invalid,
267+is a
268+.B POSIX.1
269+conforming extension to
270+.BR \%ISO\(hyC99 .
271+.
272+.
273+.\"SH EXAMPLE
274+.
275+.
276+.SH CAVEATS AND BUGS
277+If
278+.BR \%mbrlen ()
279+is called with a NULL pointer for
280+.IR s ,
281+the behaviour is undefined.
282+.
283+.PP
284+Due to a documented limitation of Microsoft\(aqs
285+.BR \%setlocale ()
286+function implementation,
287+it is not possible to directly select an active locale,
288+in which the codeset is represented by any multibyte
289+character sequence with an effective
290+.B \%MB_CUR_MAX
291+of more than two bytes.
292+Prior to \%mingwrt\(hy5.3,
293+this limitation precludes the use of
294+.BR \%mbrlen ()
295+to interpret any codeset with
296+.B \%MB_CUR_MAX
297+greater than two bytes,
298+(such as
299+.BR \%UTF\(hy8 ).
300+From \%mingwrt\(hy5.3 onward,
301+the MinGW.org implementation of
302+.BR \%mbrlen ()
303+mitigates this limitation by assignment of the codeset
304+from the
305+.B \%LC_CTYPE
306+environment variable,
307+provided the system default has been previously activated
308+for the
309+.B \%LC_CTYPE
310+locale category;
311+e.g.\ execution of:
312+.PP
313+.RS 4
314+.EX
315+#define _ISOC99_SOURCE
316+
317+#include <stdio.h>
318+#include <stdlib.h>
319+#include <locale.h>
320+#include <limits.h>
321+#include <wchar.h>
322+
323+int main()
324+{
325+ setlocale( LC_CTYPE, "" );
326+ putenv( "LC_CTYPE=en_GB.65001" );
327+ printf( "%u bytes\en",
328+ mbrlen( "\eU0001d10b", MB_LEN_MAX, NULL )
329+ );
330+ return 0;
331+}
332+.EE
333+.RE
334+.PP
335+will interpret the string \fC\%"\eU0001d10b"\fP as a \%four\(hybyte
336+.B \%UTF\(hy8
337+encoding sequence,
338+(which represents a single code point),
339+and print the result as \fC4\fP\ \fC\%bytes\fP.
340+.
341+.PP
342+Please be aware that the underlying \%MS\(hyWindows API,
343+which is used to interpret the multibyte sequence,
344+offers no readily accessible mechanism to discriminate
345+between incomplete and invalid sequences;
346+thus,
347+if
348+.I n
349+is less than the effective
350+.B \%MB_CUR_MAX
351+for the active codeset,
352+this
353+.BR \%mbrlen ()
354+implementation may return
355+.IR \%(size_t)(\-2) ,
356+indicating an incomplete sequence,
357+even in cases where there are no additional bytes
358+which could be appended,
359+to complete a valid encoding sequence.
360+.
361+.
362+.SH SEE ALSO
363+.BR mbrtowc (3)
364+.
365+.
366+.SH AUTHOR
367+This manpage was written by \%Keith\ Marshall,
368+\%<keith@users.osdn.me>,
369+to document the
370+.BR \%mbrlen ()
371+function as it has been implemented for the MinGW.org Project.
372+It may be copied, modified and redistributed,
373+without restriction of copyright,
374+provided this acknowledgement of contribution by
375+the original author remains in place.
376+.
377+.\" EOF
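The return value conventions described in the mbrlen(3) page above may be illustrated by a loop which counts the multibyte characters in a string. This is a sketch only, assuming standard C99 behaviour and a caller supplied conversion state; the helper name count_mb_chars() is hypothetical.

```c
#include <string.h>  /* memset(), strlen() */
#include <wchar.h>   /* mbrlen(), mbstate_t */

/* Hypothetical helper, (not part of mingwrt): count the multibyte
 * characters in the NUL terminated string s, as interpreted in the
 * codeset of the active LC_CTYPE locale; returns (size_t)(-1) if
 * an invalid, or an incomplete, sequence is encountered.
 */
size_t count_mb_chars( const char *s )
{
  mbstate_t st;
  size_t count = 0;
  size_t remain = strlen( s );

  /* Use a private state object, set to the initial state, rather
   * than the internal state which mbrlen() uses when ps is NULL.
   */
  memset( &st, 0, sizeof( st ) );

  while( remain > 0 )
  {
    size_t n = mbrlen( s, remain, &st );

    /* (size_t)(-1): invalid sequence; (size_t)(-2): the remaining
     * bytes do not complete a multibyte character.
     */
    if( (n == (size_t)(-1)) || (n == (size_t)(-2)) ) return (size_t)(-1);

    /* Zero indicates the NUL code point; strlen() excludes it, so
     * this is only a defensive guard against looping forever.
     */
    if( n == 0 ) break;

    s += n; remain -= n; ++count;
  }
  return count;
}
```

In the default "C" locale, every byte of an ASCII string is one character, so count_mb_chars( "abc" ) returns 3.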
--- /dev/null
+++ b/mingwrt/man/mbrtowc.3.man
@@ -0,0 +1,681 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B mbrtowc
6+\- convert from multibyte to wide character encoding
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < wchar.h >
12+.PP
13+.B size_t mbrtowc( wchar_t
14+.BI * pwc ,
15+.B const char
16+.BI * s ,
17+.B size_t
18+.IB n ,
19+.B mbstate_t
20+.BI * ps
21+.B );
22+.
23+.IP \& -4n
24+Feature Test Macro Requirements for libmingwex:
25+.PP
26+.BR \%__MSVCRT_VERSION__ :
27+since \%mingwrt\(hy5.3,
28+if this feature test macro is
29+.IR defined ,
30+with a value of
31+.I at least
32+.IR 0x0800 ,
33+(corresponding to the symbolic constant,
34+.BR \%__MSVCR80_DLL ,
35+and thus declaring intent to link with \%MSVCR80.DLL,
36+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
37+instead of with \%MSVCRT.DLL),
38+calls to
39+.BR mbrtowc ()
40+will be directed to the implementation thereof,
41+within \%Microsoft\(aqs runtime DLL.
42+.
43+.PP
44+.BR \%_ISOC99_SOURCE ,
45+.BR \%_ISOC11_SOURCE :
46+since \%mingwrt\(hy5.3.1,
47+when linking with \%MSVCRT.DLL,
48+or when
49+.B \%__MSVCRT_VERSION__
50+is either
51+.IR undefined ,
52+or is
53+.I defined
54+with any value which is
55+.I less than
56+.IR 0x0800 ,
57+(thus denying intent to link with \%MSVCR80.DLL,
58+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
59+.I explicitly
60+defining either of these feature test macros
61+will cause any call to
62+.BR \%mbrtowc ()
63+to be directed to the
64+.I \%libmingwex
65+implementation;
66+if neither macro is defined,
67+calls to
68+.BR \%mbrtowc ()
69+will be directed to Microsoft\(aqs runtime implementation,
70+if it is available,
71+otherwise falling back to the
72+.I \%libmingwex
73+implementation.
74+.
75+.PP
76+Prior to \%mingwrt\(hy5.3,
77+none of the above feature test macros have any effect on
78+.BR \%mbrtowc ();
79+all calls will be directed to the
80+.I \%libmingwex
81+implementation.
82+.
83+.
84+.SH DESCRIPTION
85+If
86+.I s
87+is a NULL pointer,
88+the
89+.IR pwc ,
90+and the
91+.I n
92+arguments are ignored,
93+and the call to the
94+.BR \%mbrtowc ()
95+function is interpreted as if invoked as
96+.PP
97+.RS 4n
98+.EX
99+mbrtowc( NULL, "", 1, ps );
100+.EE
101+.RE
102+.
103+.PP
104+Otherwise,
105+if
106+.I s
107+is not a NULL pointer,
108+the
109+.BR \%mbrtowc ()
110+function inspects the sequence of bytes,
111+starting at
112+.IR s ,
113+up to a maximum of
114+.I n
115+bytes,
116+to determine the number of bytes required to complete
117+the next multibyte code point,
118+commencing from the conversion state specified in
119+.IR *ps ,
120+(which is then updated).
121+Then,
122+if
123+.I pwc
124+is not a NULL pointer,
125+and
126+.I n
127+or fewer bytes is sufficient to complete a single
128+multibyte character,
129+the single
130+.B \%wchar_t
131+wide character conversion of that multibyte character
132+is stored at
133+.IR *pwc .
134+.
135+.PP
136+The sequence of bytes,
137+pointed to by
138+.IR s ,
139+is interpreted as a multibyte character sequence
140+in the codeset which is associated with the
141+.B \%LC_CTYPE
142+category of the active process locale.
143+.
144+.PP
145+If
146+.I ps
147+is specified as a NULL pointer,
148+.BR \%mbrtowc ()
149+will track conversion state using an internal
150+.B \%mbstate_t
151+object reference,
152+which is private within the
153+.BR \%mbrtowc ()
154+process address space;
155+at process \%start\(hyup,
156+this internal
157+.B \%mbstate_t
158+object is initialized to represent
159+the initial conversion state.
160+.
161+.PP
162+In the special case,
163+where the conversion of a completed multibyte character
164+must be represented as a
165+.B \%UTF\(hy16LE
166+.IR surrogate\ pair ,
167+and
168+.I pwc
169+is not a NULL pointer,
170+only the
171+.I high\ surrogate
172+will be stored at
173+.IR *pwc ;
174+please refer to the section
175+.B CAVEATS AND
176+.BR BUGS ,
177+below,
178+for advice on retrieval of the
179+.IR low\ surrogate .
180+.
181+.
182+.SH RETURN VALUE
183+If the multibyte sequence,
184+completed by
185+.I n
186+or fewer bytes,
187+does not represent the NUL code point,
188+then
189+.BR \%mbrtowc ()
190+returns the number of bytes which are actually required
191+to complete the sequence,
192+(a number between 1 and
193+.IR n ,
194+inclusive),
195+and the conversion state,
196+as specified in
197+.IR *ps ,
198+is reset to the initial state;
199+if
200+.I pwc
201+is not a NULL pointer,
202+the wide character conversion of the completed
203+multibyte character is stored at
204+.IR *pwc .
205+.
206+.PP
207+On the other hand,
208+if the completed multibyte sequence
209+.I does
210+represent the NUL code point,
211+then
212+.BR \%mbrtowc ()
213+returns zero,
214+and the conversion state,
215+as specified in
216+.IR *ps ,
217+is reset to the initial state;
218+if
219+.I pwc
220+is not a NULL pointer,
221+the NUL wide character is stored at
222+.IR *pwc .
223+.
224+.PP
225+If
226+.I n
227+is less than the effective
228+.B \%MB_CUR_MAX
229+for the active process locale,
230+and
231+.I n
232+bytes is insufficient to complete a multibyte character,
233+then
234+.I *ps
235+is updated to represent a new partially completed encoding state,
236+(no wide character conversion is stored),
237+and
238+.BR \%mbrtowc ()
239+returns
240+.IR \%(size_t)(\-2) .
241+(If
242+.I n
243+is equal to,
244+or greater than
245+.BR \%MB_CUR_MAX ,
246+this return condition can arise,
247+only if the multibyte encoding sequence includes
248+redundant shift states;
249+since shift states are not used,
250+this cannot occur in any \%MS\(hyWindows
251+multibyte character set).
252+.
253+.
254+.SH ERROR CONDITIONS
255+If the sequence of
256+.I n
257+or fewer bytes,
258+pointed to by
259+.IR s ,
260+extends any pending encoding state recorded within
261+.IR *ps ,
262+to at least
263+.B \%MB_CUR_MAX
264+bytes,
265+and the resulting sequence does not represent
266+a valid multibyte character,
267+then
268+.I \%errno
269+is set to
270+.BR \%EILSEQ ,
271+no wide character conversion is stored,
272+and
273+.BR \%mbrtowc ()
274+returns
275+.IR \%(size_t)(\-1) .
276+.
277+.PP
278+If,
279+on entry to
280+.BR \%mbrtowc (),
281+the conversion state represented by
282+.I *ps
283+is deemed to be
284+.IR invalid ,
285+.I \%errno
286+is set to
287+.BR \%EINVAL ,
288+and
289+.BR \%mbrtowc ()
290+returns
291+.IR \%(size_t)(\-1) ;
292+the conversion state may be deemed to be invalid if
293+it contains any sequence of bytes which does not match
294+a valid initial sequence from a multibyte character
295+representation within the currently active codeset,
296+if it can be interpreted as a complete multibyte character,
297+.I without
298+the addition of any further bytes from
299+.IR s ,
300+or if it represents a
301+.I surrogate\ pair
302+conversion,
303+resulting from a preceding call to
304+.BR \%mbrtowc (),
305+from which the
306+.I low\ surrogate
307+has yet to be retrieved,
308+(and this is not the special case in which
309+.I n
310+is specified as
311+.IR \%zero ,
312+indicating that this call is intended
313+to retrieve that pending
314+.IR low\ surrogate ).
315+.
316+.
317+.SH STANDARDS CONFORMANCE
318+Except in respect of its extended provision for handling of
319+.IR surrogate\ pairs ,
320+and to the extent that it may be affected by limitations
321+of the underlying \%MS\(hyWindows API,
322+the
323+.I \%libmingwex
324+implementation of
325+.BR mbrtowc ()
326+conforms generally to
327+.BR \%ISO\(hyC99 ,
328+.BR \%POSIX.1\(hy2001 ,
329+and
330+.BR \%POSIX.1\(hy2008 ;
331+(prior to \%mingwrt\(hy5.3,
332+and in those cases where calls may be delegated
333+to a Microsoft runtime DLL implementation,
334+this level of conformity may not be achieved).
335+.
336+.PP
337+The feature whereby
338+.I \%errno
339+is set to
340+.BR EINVAL ,
341+when
342+.I *ps
343+is found to be invalid,
344+is a
345+.B POSIX.1
346+conforming extension to
347+.BR \%ISO\(hyC99 .
348+.
349+.
350+.\"SH EXAMPLE
351+.
352+.
353+.SH CAVEATS AND BUGS
354+Due to a documented limitation of Microsoft\(aqs
355+.BR \%setlocale ()
356+function implementation,
357+it is not possible to directly select an active locale,
358+in which the codeset is represented by any multibyte
359+character sequence with an effective
360+.B \%MB_CUR_MAX
361+of more than two bytes.
362+Prior to \%mingwrt\(hy5.3,
363+this limitation precludes the use of
364+.BR \%mbrtowc ()
365+to interpret any codeset with
366+.B \%MB_CUR_MAX
367+greater than two bytes,
368+(such as
369+.BR \%UTF\(hy8 ).
370+From \%mingwrt\(hy5.3 onward,
371+the MinGW.org implementation of
372+.BR \%mbrtowc ()
373+mitigates this limitation by assignment of the codeset
374+from the
375+.B \%LC_CTYPE
376+environment variable,
377+provided the system default has been previously activated
378+for the
379+.B \%LC_CTYPE
380+locale category;
381+e.g.\ execution of:
382+.PP
383+.RS 4n
384+.EX
385+#include <stdio.h>
386+#include <stdlib.h>
387+#include <locale.h>
388+#include <limits.h>
389+#include <wchar.h>
390+
391+void print_conv( const char * );
392+
393+int main()
394+{
395+ setlocale( LC_CTYPE, "" );
396+ putenv( "LC_CTYPE=en_GB.65001" );
397+ print_conv( "\eU0001d10b" );
398+ print_conv( "\eu6c34" );
399+ return 0;
400+}
401+
402+void print_conv( const char *mbs )
403+{
404+ wchar_t wch;
405+ size_t n = mbrtowc( &wch, mbs, MB_LEN_MAX, NULL );
406+ if( (int)(n) > 0 ) printf( "%u bytes \-> 0x%04X\en", n, wch );
407+ else if( n == (size_t)(\-1) ) perror( "mbrtowc" );
408+}
409+.EE
410+.RE
411+.PP
412+will interpret the string \fC"\eU0001d10b"\fP as a \%four\(hybyte
413+.B \%UTF\(hy8
414+encoding sequence,
415+(which represents a single Unicode code point),
416+but will fail to interpret the following \fC"\eu6c34"\fP sequence,
417+(which also represents a valid Unicode code point),
418+and,
419+(if
420+.B stderr
421+is redirected to
422+.BR stdout ),
423+will print the result as:
424+.PP
425+.RS 4n
426+.EX
427+4 bytes \-> 0xD834
428+mbrtowc: Invalid argument
429+.EE
430+.RE
431+.PP
432+This example illustrates a potentially irreconcilable
433+deviation of any
434+.BR \%mbrtowc ()
435+implementation,
436+on \%MS\(hyWindows,
437+from the
438+.B \%ISO\(hyC99
439+standard:
440+due to \%Microsoft\(aqs choice of
441+.B \%UTF\(hy16LE
442+as the underlying representation of the
443+.B \%wchar_t
444+data type,
445+it is not possible to satisfy the requirement,
446+implicit in the
447+.B \%ISO\(hyC99
448+specification for
449+.BR \%mbrtowc (),
450+that it should be possible to return the complete representation
451+of any single representable Unicode code point as a single
452+.B \%wchar_t
453+value.
454+In the case of this example,
455+whereas the \%4\(hybyte
456+.B \%UTF\(hy8
457+representation of the \fC\%"\eU0001d10b"\fP Unicode code point
458+.I is
459+complete,
460+the \fC\%0xD834\fP
461+.B \%wchar_t
462+representation,
463+as returned by
464+.BR \%mbrtowc (),
465+is
466+.I not
467+complete;
468+it represents a
469+.B \%UTF\(hy16
470+.IR high\ surrogate ,
471+which
472+.I must
473+be paired with a corresponding
474+.I low\ surrogate
475+to complete it,
476+and,
477+since
478+.B \%ISO\(hyC99
479+requires that the
480+.B \%pwc
481+argument to
482+.BR \%mbrtowc ()
483+refers to sufficient storage space to accommodate only
484+.I one
485+.B \%wchar_t
486+value,
487+it is not possible for
488+.BR \%mbrtowc ()
489+to
490+.I safely
491+return
492+.I both
493+the
494+.IR high\ surrogate ,
495+and its complementary
496+.IR low\ surrogate ,
497+in a single call.
498+To mitigate this non\(hyconformance,
499+from \%mingwrt\(hy5.3 onward,
500+the \%MinGW implementation of
501+.BR \%mbrtowc ()
502+supports the following non\(hystandard strategy
503+for completion of any conversion which requires return of a
504+.IR surrogate\ pair :
505+.
506+.RS 2n
507+.ll -2n
508+.IP \(bu 2n
509+Any translation unit,
510+in which
511+.BR \%mbrtowc ()
512+is called,
513+should:
514+.RS 2n
515+.ll -2n
516+.IP a) 3n
517+explicitly define either the
518+.BR \%_ISOC99_SOURCE ,
519+or the
520+.B \%_ISOC11_SOURCE
521+feature test macro,
522+(with any arbitrary value,
523+or even no value),
524+.B before
525+including
526+.I any
527+header file,
528+and
529+.IP b) 3n
530+include the
531+.B \%<winnls.h>
532+header file,
533+in addition to the required
534+.B \%<wchar.h>
535+header.
536+.ll +2n
537+.RE
538+.
539+.IP \(bu 2n
540+Following each call of
541+.BR \%mbrtowc (),
542+which returns a
543+.B \%wchar_t
544+value with a converted byte count greater than zero,
545+test the returned
546+.B \%wchar_t
547+value,
548+using the
549+.BR \%IS_HIGH_SURROGATE ()
550+macro.
551+.
552+.IP \(bu 2
553+When the
554+.BR \%IS_HIGH_SURROGATE ()
555+macro call indicates that the returned
556+.B \%wchar_t
557+value does represent a
558+.IR high\ surrogate ,
559+immediately call
560+.BR mbrtowc ()
561+again,
562+passing the
563+.B \%*ps
564+state as returned by the original call,
565+together with the original multibyte sequence reference,
566+but with an explicit scan length limit,
567+.BR \%n ,
568+of zero,
569+and an alternative
570+.B \%wchar_t
571+buffer reference pointer,
572+for storage of the
573+.IR low\ surrogate ;
574+on successful retrieval of this
575+.IR low\ surrogate ,
576+the additional converted byte count will be returned as zero,
577+and the pending
578+.B \%*ps
579+conversion state will have been cleared,
580+(i.e.\& reset to the initial state).
581+.ll +2n
582+.RE
583+.
584+.PP
585+Thus,
586+considering the preceding example,
587+to support interpretation of
588+.I surrogate pairs
589+the example code should be modified by insertion of:
590+.PP
591+.RS 4n
592+.EX
593+#define _ISOC99_SOURCE
594+#include <winnls.h>
595+.EE
596+.RE
597+.PP
598+at the top of the source file,
599+and reimplementation of the
600+.BR print_conv ()
601+function,
602+to incorporate the
603+.BR IS_HIGH_SURROGATE ()
604+test,
605+and response:
606+.PP
607+.RS 4n
608+.EX
609+void print_conv( const char *mbs )
610+{
611+ wchar_t wch;
612+ size_t n = mbrtowc( &wch, mbs, MB_LEN_MAX, NULL );
613+ if( (int)(n) > 0 )
614+ {
615+ if( IS_HIGH_SURROGATE( wch ) )
616+ {
617+ wchar_t wcl;
618+ mbrtowc( &wcl, mbs, 0, NULL );
619+ printf( "%u bytes \-> 0x%04X:0x%04X\en", n, wch, wcl );
620+ }
621+ else printf( "%u bytes \-> 0x%04X\en", n, wch );
622+ }
623+ else if( n == (size_t)(\-1) ) perror( "mbrtowc" );
624+}
625+.EE
626+.RE
627+.
628+.PP
629+With these changes in place,
630+the output from the program becomes:
631+.PP
632+.RS 4n
633+.EX
634+4 bytes \-> 0xD834:0xDD0B
635+3 bytes \-> 0x6C34
636+.EE
637+.RE
638+.PP
639+thus now correctly reporting the conversion of the
640+.IR surrogate\ pair ,
641+and then correctly interpreting the following \%3-byte
642+.B \%UTF\(hy8
643+sequence.
644+.
645+.PP
646+Please be aware that the underlying \%MS\(hyWindows API,
647+which is used to interpret the multibyte sequence,
648+offers no readily accessible mechanism to discriminate
649+between incomplete and invalid sequences;
650+thus,
651+if
652+.I n
653+is less than the effective
654+.B \%MB_CUR_MAX
655+for the active codeset,
656+this
657+.BR \%mbrtowc ()
658+implementation may return
659+.IR \%(size_t)(\-2) ,
660+indicating an incomplete sequence,
661+even in cases where there are no additional bytes
662+which could be appended,
663+to complete a valid encoding sequence.
664+.
665+.
666+.SH SEE ALSO
667+.BR mbsrtowcs (3)
668+.
669+.
670+.SH AUTHOR
671+This manpage was written by \%Keith\ Marshall,
672+\%<keith@users.osdn.me>,
673+to document the
674+.BR \%mbrtowc ()
675+function as it has been implemented for the MinGW.org Project.
676+It may be copied, modified and redistributed,
677+without restriction of copyright,
678+provided this acknowledgement of contribution by
679+the original author remains in place.
680+.
681+.\" EOF
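The mbrtowc(3) page above documents three classes of return value: a byte count, zero for the NUL code point, and the (size_t)(-1) and (size_t)(-2) indicators. The following sketch shows how a conversion loop might dispatch on them, assuming standard C99 behaviour only; the helper name convert_mbs() is hypothetical, and it deliberately omits the MinGW specific follow-up call, with n specified as zero, which the page describes for retrieval of a pending low surrogate.

```c
#include <string.h>  /* memset(), strlen() */
#include <wchar.h>   /* mbrtowc(), mbstate_t, wchar_t */

/* Hypothetical helper, (not part of mingwrt): convert the multibyte
 * string mbs, storing at most max wide characters at wcs, with no
 * terminator appended; returns the number of wide characters stored,
 * or (size_t)(-1) on an invalid or incomplete input sequence.
 */
size_t convert_mbs( wchar_t *wcs, size_t max, const char *mbs )
{
  mbstate_t st;
  size_t count = 0;
  size_t remain = strlen( mbs );

  /* Begin in the initial conversion state. */
  memset( &st, 0, sizeof( st ) );

  while( (remain > 0) && (count < max) )
  {
    size_t n = mbrtowc( wcs + count, mbs, remain, &st );

    /* Invalid sequence; errno has been set by mbrtowc(). */
    if( n == (size_t)(-1) ) return n;

    /* The input ends part way through a multibyte character; this
     * sketch simply reports that as a failure.
     */
    if( n == (size_t)(-2) ) return (size_t)(-1);

    /* NUL code point; defensive only, since strlen() excludes it. */
    if( n == 0 ) break;

    mbs += n; remain -= n; ++count;
  }
  return count;
}
```

In the default "C" locale, convert_mbs( buf, 8, "abc" ) stores the three wide characters L'a', L'b', and L'c', and returns 3.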
--- /dev/null
+++ b/mingwrt/man/mbsinit.3.man
@@ -0,0 +1,262 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B \%mbsinit
6+\- check state of multibyte to wide character conversion
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < wchar.h >
12+.PP
13+.B int mbsinit( mbstate_t
14+.BI * ps
15+.B );
16+.
17+.
18+.SH DESCRIPTION
19+If
20+.I ps
21+is not a NULL pointer,
22+the
23+.BR \%mbsinit ()
24+function determines whether the
25+.B \%mbstate_t
26+object,
27+to which it points,
28+represents a multibyte to wide character conversion in the
29+.IR initial ,
30+or in an
31+.I intermediate
32+state.
33+.
34+.PP
35+The
36+.I initial
37+conversion state is represented by a
38+.I zero\(hyvalued
39+.B \%mbstate_t
40+object.
41+(POSIX.1 stipulates that this representation must be supported,
42+although additional alternative representations are permitted;
43+MinGW uses only the zero\(hyvalued representation).
44+.
45+.PP
46+In MinGW,
47+an initial conversion state may be established by initialization:
48+.PP
49+.RS 4n
50+.EX
51+mbstate_t st = (mbstate_t)(0), *ps = &st;
52+.EE
53+.RE
54+.PP
55+or by assignment:
56+.PP
57+.RS 4n
58+.EX
59+*ps = (mbstate_t)(0);
60+.EE
61+.RE
62+.PP
63+However,
64+for portability:
65+.PP
66+.RS 4n
67+.EX
68+memset( ps, 0, sizeof( mbstate_t ));
69+.EE
70+.RE
71+.PP
72+may be preferred.
73+.
74+.PP
75+Nominally,
76+.B \%mbstate_t
77+objects represent
78+.I shift states
79+of the active codeset.
80+However,
81+since \%MS\(hyWindows codesets do not use shift states,
82+as such,
83+MinGW uses
84+.B \%mbstate_t
85+objects to represent an alternative class of
86+.I intermediate conversion
87+.IR states ,
88+viz.:
89+.RS 2n
90+.ll -2n
91+.IP \(bu 2n
92+Parsing of a multibyte sequence has been interrupted,
93+before interpretation of
94+.B \%MB_CUR_MAX
95+bytes,
96+without identification of a complete code point;
97+this conversion state may arise following a call of
98+.BR mbrlen (3),
99+or
100+.BR mbrtowc (3),
101+which has returned a parsed sequence length of
102+.IR \%(size_t)(\-2) .
103+.
104+.IP \(bu 2n
105+Processing of a wide character sequence has encountered a
106+.IR high\ surrogate ,
107+but the complementary
108+.I low surrogate
109+has yet to be evaluated;
110+this state may arise after a call of
111+.BR mbrtowc (3),
112+has returned the
113+.IR high\ surrogate ,
114+(with a returned sequence length between
115+.I one
116+and
117+.BR \%MB_CUR_MAX ),
118+and a further call is needed,
119+to retrieve the
120+.IR low\ surrogate ;
121+alternatively,
122+a complementary conversion state may arise when
123+.BR wcrtomb (3)
124+has been called to interpret a
125+.IR high\ surrogate ,
126+and a further call,
127+to complete the conversion to a multibyte sequence,
128+by evaluation of the complementary
129+.IR low\ surrogate ,
130+is still required.
131+.ll +2n
132+.RE
133+.
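The first of the intermediate states listed above may be illustrated by passing a truncated multibyte sequence to mbrtowc(), and then inspecting the state with mbsinit(). This is a sketch only: the function name is hypothetical, and the locale names "C.UTF-8" and "en_US.UTF-8" are assumptions which suit a typical POSIX host; with MinGW, a UTF-8 codeset would have to be selected by other means.

```c
#include <locale.h>
#include <string.h>
#include <wchar.h>

/* Returns 1 if a truncated multibyte sequence leaves the conversion
 * state in an intermediate state, 0 if it does not, or -1 if no UTF-8
 * locale could be selected on this host.
 * (Hypothetical helper; the locale names below are assumptions.)
 */
int partial_sequence_is_intermediate( void )
{
  mbstate_t st;
  wchar_t wc;
  size_t n;
  int result;

  if( (setlocale( LC_CTYPE, "C.UTF-8" ) == NULL)
  &&  (setlocale( LC_CTYPE, "en_US.UTF-8" ) == NULL) )
    return -1;

  memset( &st, 0, sizeof( mbstate_t ) );

  /* "\xe6\xb0\xb4" encodes U+6C34; pass only its first byte. */
  n = mbrtowc( &wc, "\xe6", 1, &st );
  result = (n == (size_t)(-2)) && (mbsinit( &st ) == 0);

  setlocale( LC_CTYPE, "C" );   /* leave a predictable locale behind */
  return result;
}
```

Passing the remaining two bytes, in a further mbrtowc() call with the same state object, would complete the code point and return the state to initial.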
134+.
135+.SH RETURN VALUE
136+If
137+.I ps
138+is a NULL pointer,
139+or if the conversion state,
140+represented by the
141+.B \%mbstate_t
142+object to which it points,
143+is the
144+.I initial
145+state,
146+.BR \%mbsinit ()
147+returns a
148+.I \%non\(hyzero
149+value;
150+otherwise,
151+.I \%zero
152+is returned,
153+indicating an
154+.I intermediate
155+conversion state.
156+.
157+.
158+.SH ERROR CONDITIONS
159+No error conditions are defined.
160+.
161+.
162+.SH STANDARDS CONFORMANCE
163+There is no Microsoft implementation of the
164+.BR mbsinit ()
165+function,
166+which is readily accessible for use in MinGW applications;
167+the
168+.I \%libmingwex
169+implementation conforms generally to
170+.BR \%ISO\(hyC99 ,
171+.BR \%POSIX.1\(hy2001 ,
172+and
173+.BR \%POSIX.1\(hy2008 .
174+.
175+.
176+.\"SH EXAMPLE
177+.
178+.
179+.SH CAVEATS AND BUGS
180+Prior to \%mingwrt\(hy5.3,
181+the
182+.I \%libmingwex
183+implementation of
184+.BR mbsinit ()
185+would always return
186+.IR \%non\(hyzero ,
187+apparently indicating an
188+.I initial
189+conversion state,
190+regardless of the actual state indicated by any
191+.B \%mbstate_t
192+object referred to by
193+.IR *ps ;
194+this defect is corrected,
195+in \%mingwrt\(hy5.3.
196+.
197+.PP
198+Any
199+.I intermediate conversion
200+.IR state ,
201+arising from a call to
202+.BR mbrlen (3),
203+.BR mbrtowc (3),
204+or
205+.BR wcrtomb (3),
206+is specific to the particular conversion which produces it.
207+Any intermediate state produced by
208+.BR mbrlen (3),
209+or by
210+.BR mbrtowc (3)
211+may be resolved by a further call to either of these two functions,
212+or to
213+.BR mbsrtowcs (3),
214+provided the initial part of the multibyte sequence,
215+passed in the subsequent call,
216+completes the sequence which led to the intermediate state;
217+if this intermediate state is used in any other context,
218+the consequent behaviour is undefined.
219+.
220+.PP
221+Similarly,
222+an intermediate state resulting from a call to
223+.BR wcrtomb (3)
224+may be resolved by a further call to
225+.BR wcrtomb (3),
226+or to
227+.BR wcsrtombs (3),
228+provided the first,
229+(or the only),
230+wide character to be interpreted,
231+in the subsequent call,
232+represents the
233+.I low surrogate
234+which completes the pending
235+.I surrogate pair
236+from which the intermediate state was created.
237+Once again,
238+if this intermediate state is used in any other context,
239+the consequent behaviour is undefined.
240+.
241+.
242+.SH SEE ALSO
243+.BR \%mbrlen (3),
244+.BR \%mbrtowc (3),
245+.BR \%mbsrtowcs (3),
246+.BR \%wcrtomb (3),
247+and
248+.BR \%wcsrtombs (3).
249+.
250+.
251+.SH AUTHOR
252+This manpage was written by \%Keith\ Marshall,
253+\%<keith@users.osdn.me>,
254+to document the
255+.BR \%mbsinit ()
256+function as it has been implemented for the MinGW.org Project.
257+It may be copied, modified and redistributed,
258+without restriction of copyright,
259+provided this acknowledgement of contribution by
260+the original author remains in place.
261+.
262+.\" EOF
--- /dev/null
+++ b/mingwrt/man/mbsrtowcs.3.man
@@ -0,0 +1,521 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B mbsrtowcs
6+\- convert from multibyte to wide character string
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < wchar.h >
12+.PP
13+.B size_t mbsrtowcs( wchar_t
14+.BI * dst ,
15+.B const char
16+.BI ** src ,
17+.B size_t
18+.IB len ,
19+.B mbstate_t
20+.BI * ps
21+.B );
22+.
23+.IP \& -4n
24+Feature Test Macro Requirements for libmingwex:
25+.PP
26+.BR \%__MSVCRT_VERSION__ :
27+since \%mingwrt\(hy5.3,
28+if this feature test macro is
29+.IR defined ,
30+with a value of
31+.I at least
32+.IR 0x0800 ,
33+(corresponding to the symbolic constant,
34+.BR \%__MSVCR80_DLL ,
35+and thus declaring intent to link with \%MSVCR80.DLL,
36+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
37+instead of with \%MSVCRT.DLL),
38+calls to
39+.BR mbsrtowcs ()
40+will be directed to the implementation thereof,
41+within \%Microsoft\(aqs runtime DLL.
42+.
43+.PP
44+.BR \%_ISOC99_SOURCE ,
45+.BR \%_ISOC11_SOURCE :
46+since \%mingwrt\(hy5.3.1,
47+when linking with \%MSVCRT.DLL,
48+or when
49+.B \%__MSVCRT_VERSION__
50+is either
51+.IR undefined ,
52+or is
53+.I defined
54+with any value which is
55+.I less than
56+.IR 0x0800 ,
57+(thus denying intent to link with \%MSVCR80.DLL,
58+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
59+.I explicitly
60+defining either of these feature test macros
61+will cause any call to
62+.BR \%mbsrtowcs ()
63+to be directed to the
64+.I \%libmingwex
65+implementation;
66+if neither macro is defined,
67+calls to
68+.BR \%mbsrtowcs ()
69+will be directed to Microsoft\(aqs runtime implementation,
70+if it is available,
71+otherwise falling back to the
72+.I \%libmingwex
73+implementation.
74+.
75+.PP
76+Prior to \%mingwrt\(hy5.3,
77+none of the above feature test macros have any effect on
78+.BR \%mbsrtowcs ();
79+all calls will be directed to the
80+.I \%libmingwex
81+implementation.
82+.
83+.
84+.SH DESCRIPTION
85+.PP
86+Commencing from the conversion state specified in
87+.IR *ps ,
88+the
89+.BR \%mbsrtowcs ()
90+function converts the multibyte character sequence,
91+starting at
92+.IR *src ,
93+to a sequence of wide characters;
94+each conversion is performed as if by calling the
95+.BR mbrtowc (3)
96+function.
97+.
98+.PP
99+If
100+.I dst
101+is not a NULL pointer,
102+the resulting sequence of wide characters,
103+up to a maximum of
104+.I len
105+in number,
106+will be stored as a wide character string,
107+starting at
108+.IR dst ;
109+conversion may be curtailed,
110+before
111+.I len
112+wide characters have been stored,
113+under any of the following conditions:
114+.RS 2n
115+.ll -2n
116+.IP \(bu 2n
117+The result of any one conversion represents the NUL wide character,
118+(in which case the NUL wide character is stored,
119+but is not included in the count of characters converted).
120+.
121+.IP \(bu 2n
122+The result of any single multibyte character conversion is a
123+.IR surrogate\ pair ,
124+but the available space,
125+remaining in the conversion buffer,
126+is insufficient to accommodate more than one
127+.B \%wchar_t
128+value.
129+.
130+.IP \(bu 2n
131+An invalid multibyte character sequence is encountered,
132+(in which case the conversion state becomes undefined).
133+.ll +2n
134+.RE
135+.
136+.PP
137+Conversely,
138+if
139+.I dst
140+is a NULL pointer,
141+the
142+.I len
143+argument is ignored,
144+and conversions are performed until either
145+the multibyte equivalent of the NUL character,
146+or an invalid multibyte sequence is encountered,
147+but no wide characters are stored.
148+.
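The counting behaviour with a NULL dst lends itself to a two pass conversion: measure first, then allocate and convert. The following is a minimal sketch of that pattern, not a definitive implementation; the name mbs_to_wcs_alloc is hypothetical, and a private mbstate_t object is used so that the internal state is left untouched.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Convert a NUL terminated multibyte string to a newly allocated wide
 * character string; returns NULL on invalid input, or on allocation
 * failure.  The caller must free() the result.
 * (mbs_to_wcs_alloc is a hypothetical helper name.)
 */
wchar_t *mbs_to_wcs_alloc( const char *mbs, size_t *count )
{
  mbstate_t st;
  const char *src = mbs;
  wchar_t *wcs;
  size_t n;

  /* Pass 1: dst is NULL, so len is ignored and nothing is stored. */
  memset( &st, 0, sizeof( mbstate_t ) );
  n = mbsrtowcs( NULL, &src, 0, &st );
  if( n == (size_t)(-1) ) return NULL;

  /* Pass 2: allow one extra element for the terminating NUL. */
  if( (wcs = malloc( (n + 1) * sizeof( wchar_t ) )) == NULL ) return NULL;
  src = mbs;
  memset( &st, 0, sizeof( mbstate_t ) );
  n = mbsrtowcs( wcs, &src, n + 1, &st );
  if( n == (size_t)(-1) ) { free( wcs ); return NULL; }

  if( count != NULL ) *count = n;
  return wcs;
}
```

Because the count excludes the terminating NUL, the buffer is sized at one more than the value returned by the first pass.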
149+.PP
150+The sequence of bytes,
151+pointed to by
152+.IR *src ,
153+is interpreted as a multibyte character sequence
154+in the codeset which is associated with the
155+.B \%LC_CTYPE
156+category of the active process locale.
157+.
158+.PP
159+If
160+.I ps
161+is specified as a NULL pointer,
162+.BR \%mbsrtowcs ()
163+will track conversion state using an internal
164+.B \%mbstate_t
165+object reference,
166+which is private within the
167+.BR \%mbsrtowcs ()
168+process address space;
169+at process \%start\(hyup,
170+this internal
171+.B \%mbstate_t
172+object is initialized to represent
173+the initial conversion state.
174+.
175+.
176+.SH RETURN VALUE
177+On successful conversion of the multibyte character
178+sequence indirectly pointed to by
179+.IR *src ,
180+up to the wide character string length limit specified by
181+.IR len ,
182+.BR \%mbsrtowcs ()
183+updates
184+.IR *src ,
185+by either:
186+.RS 2n
187+.ll -2n
188+.IP \(bu 2n
189+Replacing it with a NULL pointer,
190+if conversion is terminated by a NUL character,
191+before
192+.I len
193+wide characters have been evaluated.
194+.
195+.IP \(bu 2n
196+Incrementing it,
197+such that it points to the first multibyte character in the
198+.I *src
199+sequence,
200+which,
201+when converted,
202+would produce wide characters beyond the string length
203+limit specified by
204+.IR len .
205+.ll +2n
206+.RE
207+.PP
208+In either case,
209+.BR mbsrtowcs ()
210+returns the actual number of
211+.B \%wchar_t
212+values which have been stored at
213+.IR dst ,
214+(if
215+.I dst
216+is not a NULL pointer,
217+or which would have been stored,
218+otherwise).
219+.
220+.
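The two ways in which *src may be updated allow a long string to be converted piecewise, through a small fixed buffer, looping until *src becomes a NULL pointer. The sketch below illustrates this; the name convert_in_chunks and the CHUNK size are hypothetical, and the buffer contents are simply discarded where a real caller would consume them.

```c
#include <string.h>
#include <wchar.h>

#define CHUNK 4   /* deliberately small, for illustration only */

/* Convert mbs in CHUNK-sized pieces; returns the total number of wide
 * characters converted, excluding the terminating NUL, or (size_t)(-1)
 * if an invalid sequence is encountered.  If calls is not NULL, it
 * receives the number of mbsrtowcs() calls which were needed.
 * (convert_in_chunks is a hypothetical helper name.)
 */
size_t convert_in_chunks( const char *mbs, unsigned *calls )
{
  wchar_t buf[CHUNK];
  mbstate_t st;
  size_t total = 0;
  unsigned count = 0;

  memset( &st, 0, sizeof( mbstate_t ) );
  while( mbs != NULL )          /* NULL signals NUL termination */
  {
    size_t n = mbsrtowcs( buf, &mbs, CHUNK, &st );
    if( n == (size_t)(-1) ) return n;
    /* buf[0] .. buf[n-1] are available for use here. */
    total += n;
    ++count;
  }
  if( calls != NULL ) *calls = count;
  return total;
}
```

With the MinGW implementation, the buffer should hold at least two wchar_t values, since a surrogate pair which cannot be stored whole is not stored at all; a one element buffer could then fail to make progress.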
221+.SH ERROR CONDITIONS
222+If,
223+at any stage of conversion of the multibyte sequence at
224+.IR \%*src ,
225+and,
226+if
227+.I dst
228+is not a NULL pointer,
229+before
230+.I len
231+.B \%wchar_t
232+values have been evaluated,
233+any sequence within
234+.IR \%*src ,
235+which does not represent a valid multibyte character,
236+is encountered,
237+then
238+.I \%errno
239+is set to
240+.BR \%EILSEQ ,
241+and
242+.BR \%mbsrtowcs ()
243+returns
244+.IR \%(size_t)(\-1) ;
245+the conversion state,
246+including the state of any
247+.B \%wchar_t
248+values already stored at
249+.IR \%dst ,
250+is undefined.
251+.
252+.
253+.SH STANDARDS CONFORMANCE
254+Except in respect of its provisions for handling of
255+.IR surrogate\ pairs ,
256+and to the extent that it may be affected by limitations
257+of the underlying \%MS\(hyWindows API,
258+the
259+.I \%libmingwex
260+implementation of
261+.BR mbsrtowcs ()
262+conforms generally to
263+.BR \%ISO\(hyC99 ,
264+.BR \%POSIX.1\(hy2001 ,
265+and
266+.BR \%POSIX.1\(hy2008 ;
267+(prior to \%mingwrt\(hy5.3,
268+and in those cases where calls may be delegated
269+to a Microsoft runtime DLL implementation,
270+this level of conformity may not be achieved).
271+.
272+.
273+.\"SH EXAMPLE
274+.
275+.
276+.SH CAVEATS AND BUGS
277+Due to a documented limitation of Microsoft\(aqs
278+.BR \%setlocale ()
279+function implementation,
280+it is not possible to directly select an active locale,
281+in which the codeset is represented by any multibyte
282+character sequence with an effective
283+.B \%MB_CUR_MAX
284+of more than two bytes.
285+Prior to
286+.IR \%mingwrt\(hy5.3 ,
287+this limitation precludes the use of
288+.BR \%mbsrtowcs ()
289+to interpret any codeset with
290+.B \%MB_CUR_MAX
291+greater than two bytes,
292+(such as
293+.BR \%UTF\(hy8 ).
294+From
295+.I \%mingwrt\(hy5.3
296+onward,
297+the MinGW.org implementation of
298+.BR \%mbsrtowcs ()
299+mitigates this limitation by assignment of the codeset
300+from the
301+.B \%LC_CTYPE
302+environment variable,
303+provided the system default has been previously activated
304+for the
305+.B \%LC_CTYPE
306+locale category;
307+e.g.\ execution of:
308+.PP
309+.RS 4n
310+.EX
311+#include <stdio.h>
312+#include <stdlib.h>
313+#include <locale.h>
314+#include <wchar.h>
315+
316+void print_conv( const char * );
317+
318+int main()
319+{
320+ setlocale( LC_CTYPE, "" );
321+ putenv( "LC_CTYPE=en_GB.65001" );
322+ print_conv( "\exe6\exb0\exb4\exf0\ex9d\ex84\ex8b" );
323+ return 0;
324+}
325+
326+void print_conv( const char *mbs )
327+{
328+ size_t len;
329+ if( (len = 1 + mbsrtowcs( NULL, &mbs, 0, NULL )) > 0 )
330+ {
331+ wchar_t wcs[len], *wp = wcs;
332+ len = mbsrtowcs( wcs, &mbs, len, NULL );
333+ printf( "%u wide char%s: ", (unsigned)(len), (len == 1) ? "" : "s" );
334+ while( len > 0 )
335+ { printf( "0x%04X%c", *wp++, (\-\-len > 0) ? ':' : '\en' );
336+ }
337+ }
338+ else perror( "mbsrtowcs" );
339+}
340+.EE
341+.RE
342+.PP
343+will convert the
344+.B \%UTF\(hy8
345+encoded multibyte sequence,
346+\fC\%"\exe6\exb0\exb4\exf0\ex9d\ex84\ex8b"\fP,
347+(which represents the two Unicode code points,
348+\fC\%"\eu6c34"\fP and \fC\%"\eU0001d10b"\fP),
349+to its equivalent
350+.B \%wchar_t
351+sequence,
352+resulting in the three\(hyvalue output sequence:
353+.PP
354+.RS 4n
355+.EX
356+3 wide chars: 0x6C34:0xD834:0xDD0B
357+.EE
358+.RE
359+.
360+.PP
361+Note that,
362+in the preceding example,
363+although the input
364+.B \%UTF\(hy8
365+sequence represents only
366+.I two
367+Unicode code points,
368+the output shows
369+.I \%three
370+distinct
371+.B \%wchar_t
372+values,
373+with the second code point being represented by the
374+.IR surrogate\ pair ,
375+\fC\%"0xD834:0xDD0B"\fP.
376+This raises a potential issue,
377+which is consequent on Microsoft\(aqs choice of
378+.B \%UTF\(hy16LE
379+as the underlying representation of the
380+.B \%wchar_t
381+data type:
382+normally,
383+when
384+.I dst
385+is not a NULL pointer,
386+the MinGW
387+.BR mbsrtowcs ()
388+function will simply store a
389+.I surrogate\ pair
390+when necessary,
391+but in the particular case where doing so would cause the
392+.I low\ surrogate
393+to overrun the buffer length specified by the
394+.I len
395+argument,
396+then no part of the
397+.I surrogate\ pair
398+will be stored,
399+and
400+.BR mbsrtowcs ()
401+will stop as if the buffer length limit has been reached,
402+at a count of one less than
403+.IR len .
404+This case may be distinguished from a short count due to
405+conversion of a NUL character,
406+(in which case
407+.I *src
408+will have been respecified as a NULL pointer),
409+by inspection of
410+.IR *src ,
411+which will have been updated to point,
412+in this case,
413+to the start of that part of the multibyte sequence
414+which represents the
415+.IR surrogate\ pair .
416+.
417+.PP
418+A further issue,
419+also related to
420+.IR surrogate\ pairs ,
421+may arise if the
422+.B \%mbstate_t
423+object passed via the
424+.I *ps
425+argument originates from a preceding
426+.BR mbrtowc (3)
427+call which has returned a
428+.IR high\ surrogate ,
429+but the
430+.I low\ surrogate
431+has not been retrieved.
432+In this case,
433+the
434+.I low\ surrogate
435+is returned,
436+(and potentially orphaned),
437+as the first
438+.B \%wchar_t
439+value to be considered for storage at
440+.IR dst .
441+This may not be what you want,
442+but it is supported as an alternative to the method,
443+formally documented using
444+.BR mbrtowc (3),
445+for completion of a
446+.IR surrogate\ pair ;
447+for example:
448+.PP
449+.RS 4n
450+.EX
451+#define _ISOC99_SOURCE
452+
453+#include <stdio.h>
454+#include <stdlib.h>
455+#include <locale.h>
456+#include <limits.h>
457+#include <winnls.h>
458+#include <wchar.h>
459+
460+void print_conv( const char * );
461+
462+int main()
463+{
464+ setlocale( LC_CTYPE, "" );
465+ putenv( "LC_CTYPE=en_GB.65001" );
466+ print_conv( "\eU0001d10b" );
467+ print_conv( "\eu6c34" );
468+ return 0;
469+}
470+
471+void print_conv( const char *mbs )
472+{
473+ wchar_t wch;
474+ mbstate_t ps = (mbstate_t)(0);
475+ size_t n = mbrtowc( &wch, mbs, MB_LEN_MAX, &ps );
476+ if( (int)(n) > 0 )
477+ {
478+ if( IS_HIGH_SURROGATE( wch ) )
479+ {
480+ wchar_t wcl;
481+ mbsrtowcs( &wcl, &mbs, 1, &ps );
482+ printf( "%u bytes -> 0x%04X:0x%04X\en", n, wch, wcl );
483+ }
484+ else printf( "%u bytes -> 0x%04X\en", n, wch );
485+ }
486+ else if( n == (size_t)(-1) ) perror( "mbrtowc" );
487+}
488+.EE
489+.RE
490+.PP
491+is equivalent to the example given for
492+.I surrogate\ pair
493+completion using
494+.BR mbrtowc (3).
495+Regardless of the method used to complete
496+.IR surrogate\ pairs ,
497+it is the caller\(aqs responsibility to ensure that the
498+.I high\ surrogate
499+and its complementary
500+.I low\ surrogate
501+remain correctly associated.
502+.
503+.
504+.SH SEE ALSO
505+.BR mbsinit (3),
506+and
507+.BR mbrtowc (3)
508+.
509+.
510+.SH AUTHOR
511+This manpage was written by \%Keith\ Marshall,
512+\%<keith@users.osdn.me>,
513+to document the
514+.BR \%mbsrtowcs ()
515+function as it has been implemented for the MinGW.org Project.
516+It may be copied, modified and redistributed,
517+without restriction of copyright,
518+provided this acknowledgement of contribution by
519+the original author remains in place.
520+.
521+.\" EOF
--- /dev/null
+++ b/mingwrt/man/wcrtomb.3.man
@@ -0,0 +1,493 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B \%wcrtomb
6+\- convert a wide character to a multibyte sequence
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < wchar.h >
12+.PP
13+.B size_t wcrtomb( char
14+.BI * s ,
15+.B wchar_t
16+.IB wc ,
17+.B mbstate_t
18+.BI * ps
19+.B );
20+.
21+.IP \& -4n
22+Feature Test Macro Requirements for libmingwex:
23+.PP
24+.BR \%__MSVCRT_VERSION__ :
25+since \%mingwrt\(hy5.3,
26+if this feature test macro is
27+.IR defined ,
28+with a value of
29+.I at least
30+.IR \%0x0800 ,
31+(corresponding to the symbolic constant,
32+.BR \%__MSVCR80_DLL ,
33+and thus declaring intent to link with \%MSVCR80.DLL,
34+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
35+instead of with \%MSVCRT.DLL),
36+calls to
37+.BR \%wcrtomb ()
38+will be directed to the implementation thereof,
39+within \%Microsoft\(aqs runtime DLL.
40+.
41+.PP
42+.BR \%_ISOC99_SOURCE ,
43+.BR \%_ISOC11_SOURCE :
44+since \%mingwrt\(hy5.3.1,
45+when linking with \%MSVCRT.DLL,
46+or when
47+.B \%__MSVCRT_VERSION__
48+is either
49+.IR undefined ,
50+or is
51+.I defined
52+with any value which is
53+.I less than
54+.IR \%0x0800 ,
55+(thus denying intent to link with \%MSVCR80.DLL,
56+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
57+.I explicitly
58+defining either of these feature test macros
59+will cause any call to
60+.BR \%wcrtomb ()
61+to be directed to the
62+.I \%libmingwex
63+implementation;
64+if neither macro is defined,
65+calls to
66+.BR \%wcrtomb ()
67+will be directed to Microsoft\(aqs runtime implementation,
68+if it is available,
69+otherwise falling back to the
70+.I \%libmingwex
71+implementation.
72+.
73+.PP
74+Prior to \%mingwrt\(hy5.3,
75+none of the above feature test macros have any effect on
76+.BR \%wcrtomb ();
77+all calls will be directed to the
78+.I \%libmingwex
79+implementation.
80+.
81+.
82+.SH DESCRIPTION
83+The
84+.BR \%wcrtomb ()
85+function determines the number of bytes which are required,
86+starting from the conversion state represented by the
87+.B \%mbstate_t
88+object at
89+.IR *ps ,
90+to accommodate the multibyte character sequence,
91+in the codeset associated with the
92+.B \%LC_CTYPE
93+category of the active process locale,
94+which represents the completed conversion of
95+the wide character specified by
96+.IR wc .
97+.
98+.PP
99+In the special case,
100+when
101+.I s
102+is a NULL pointer,
103+the
104+.I wc
105+argument is ignored,
106+and the call is evaluated as if it had been invoked as
107+.PP
108+.RS 4n
109+.EX
110+wcrtomb( buf, L'\e0', ps )
111+.EE
112+.RE
113+.PP
114+returning the effect of conversion of the NUL wide character,
115+as a completion of any intermediate conversion state specified in
116+.IR *ps ,
117+but without storing the converted multibyte sequence;
118+(in this special case,
119+the
120+.B \%ISO\(hyC99
121+standard specifies that
122+.I buf
123+should be an internal buffer,
124+but since such a buffer becomes effectively inaccessible,
125+storage of any converted multibyte sequence is unnecessary).
126+.
127+.PP
128+Conversely,
129+in the normal case,
130+when
131+.I s
132+is not a NULL pointer,
133+the
134+.BR \%wcrtomb ()
135+function converts the wide character,
136+represented by
137+.IR wc ,
138+to the corresponding multibyte character sequence,
139+which is stored in the byte array starting at
140+.IR *s ,
141+and the function return value is set to
142+the number of bytes stored.
143+.
144+.
145+.SH RETURN VALUE
146+When conversion is successful,
147+regardless of whether the resultant multibyte sequence is stored,
148+or not,
149+the
150+.BR wcrtomb ()
151+function returns the number of bytes which are,
152+or which would be,
153+stored at
154+.IR *s .
155+.
156+.PP
157+If the result of conversion represents a completed multibyte sequence,
158+the conversion state,
159+represented by
160+.IR *ps ,
161+is updated to represent the
162+.I initial
163+.IR state .
164+Conversely,
165+if the result of conversion is equivalent to the conversion of a
166+.I high
167+.IR surrogate ,
168+nothing is stored,
169+the return value is set to
170+.IR zero ,
171+and the conversion state is updated to represent a pending
172+.I surrogate pair
173+completion.
174+.
175+.
176+.SH ERROR CONDITIONS
177+If the wide character,
178+passed as
179+.IR wc ,
180+either cannot be converted to a valid multibyte sequence,
181+or does not complete a pending
182+.I surrogate pair
183+which can be represented as a valid multibyte sequence,
184+in the codeset of the active
185+.B \%LC_CTYPE
186+locale category,
187+.I \%errno
188+is set to
189+.BR \%EILSEQ ,
190+the
191+.BR wcrtomb ()
192+function returns
193+.IR (size_t)(\-1) ,
194+and the conversion state is unspecified.
195+.
196+.
197+.SH STANDARDS CONFORMANCE
198+Except in respect of its extended provision for handling of
199+.IR surrogate\ pairs ,
200+and to the extent that it may be affected by limitations
201+of the underlying \%MS\(hyWindows API,
202+the
203+.I \%libmingwex
204+implementation of
205+.BR \%wcrtomb ()
206+conforms generally to
207+.BR \%ISO\(hyC99 ,
208+.BR \%POSIX.1\(hy2001 ,
209+and
210+.BR \%POSIX.1\(hy2008 ;
211+(prior to \%mingwrt\(hy5.3,
212+and in those cases where calls may be delegated
213+to a Microsoft runtime DLL implementation,
214+this level of conformity may not be achieved).
215+.
216+.
217+.\"SH EXAMPLE
218+.
219+.
220+.SH CAVEATS AND BUGS
221+Due to a documented limitation of Microsoft\(aqs
222+.BR \%setlocale ()
223+function implementation,
224+it is not possible to directly select an active locale,
225+in which the codeset is represented by any multibyte
226+character sequence with an effective
227+.B \%MB_CUR_MAX
228+of more than two bytes.
229+Prior to \%mingwrt\(hy5.3,
230+this limitation precludes the use of
231+.BR \%wcrtomb ()
232+to interpret any codeset with
233+.B \%MB_CUR_MAX
234+greater than two bytes,
235+(such as
236+.BR \%UTF\(hy8 ).
237+From \%mingwrt\(hy5.3 onward,
238+the MinGW.org implementation of
239+.BR \%wcrtomb ()
240+mitigates this limitation by assignment of the codeset
241+from the
242+.B \%LC_CTYPE
243+environment variable,
244+provided the system default has been previously activated
245+for the
246+.B \%LC_CTYPE
247+locale category;
248+e.g.\ execution of:
249+.PP
250+.RS 4n
251+.EX
252+#define _ISOC99_SOURCE
253+
254+#include <stdio.h>
255+#include <stdlib.h>
256+#include <locale.h>
257+#include <limits.h>
258+#include <wchar.h>
259+
260+void print_conv( const wchar_t * );
261+
262+int main()
263+{
264+ setlocale( LC_CTYPE, "" );
265+ putenv( "LC_CTYPE=en_GB.65001" );
266+ print_conv( L"\eu6c34\eU0001d10b" );
267+ return 0;
268+}
269+
270+void print_conv( const wchar_t *wcs )
271+{
272+ wchar_t wch;
273+ while( (wch = *wcs++) != L'\e0' )
274+ {
275+ char mbs[MB_LEN_MAX];
276+ mbstate_t ps = (mbstate_t)(0);
277+ size_t n = wcrtomb( mbs, wch, &ps );
278+
279+ if( (int)(n) > 0 )
280+ {
281+ unsigned char *p = (unsigned char *)(mbs);
282+ printf( "Single wide character: 0x%04X \-\-> %u byte%s",
283+ wch, n, (n == 1) ? ": " : "s: "
284+ );
285+ while( n > 0 )
286+ printf( "0x%02X%c", *p++, (\-\-n == 0) ? '\en' : ':' );
287+ }
288+ else if( n == (size_t)(\-1) ) perror( "wcrtomb" );
289+ }
290+}
291+.EE
292+.RE
293+.PP
294+will successfully convert the \fCL"\eu6c34"\fP wide character to its
295+.B \%UTF\(hy8
296+equivalent,
297+resulting in the output:
298+.PP
299+.RS 4n
300+.EX
301+Single wide character: 0x6C34 \-\-> 3 bytes: 0xE6:0xB0:0xB4
302+.EE
303+.RE
304+.PP
305+However,
306+when it then progresses to the \fCL"\eU0001d10b"\fP wide character,
307+(which
308+.I should
309+be represented by a valid
310+.B \%UTF\(hy16LE
311+.I surrogate
312+.IR pair ),
313+it fails with the diagnostic:
314+.PP
315+.RS 4n
316+.EX
317+wcrtomb: Invalid or incomplete multibyte or wide character
318+.EE
319+.RE
320+.
321+.PP
322+This (possibly unexpected) failure is an unfortunate consequence
323+of Microsoft\(aqs choice of
324+.B \%UTF\(hy16LE
325+as the underlying representation of the
326+.B \%wchar_t
327+data type;
328+this choice makes it impossible for
329+.I any
330+\%MS\(hyWindows implementation of
331+.BR \%wcrtomb ()
332+to be fully
333+.B \%ISO\(hyC99
334+compliant.
335+To mitigate this non\(hycompliance,
336+the MinGW implementation of
337+.BR \%wcrtomb ()
338+incorporates the following non\(hystandard capabilities:
339+.RS 2n
340+.ll -2n
341+.IP \(bu 2n
342+When the
343+.B \%mbstate_t
344+argument refers to the
345+.I initial conversion
346+.IR state ,
347+and the
348+.B \%wchar_t
349+argument represents a
350+.I high
351+.IR surrogate ,
352+then nothing is stored in the conversion buffer,
353+the
354+.B \%mbstate_t
355+reference is updated to indicate pending completion of the
356+.IR surrogate ,
357+and the function returns an effective conversion count of
358+.I zero
359+bytes.
360+.
361+.IP \(bu 2n
362+When the
363+.B \%mbstate_t
364+argument refers to a pending completion of a
365+.I surrogate
366+.IR pair ,
367+and the
368+.B \%wchar_t
369+argument represents a
370+.I low
371+.IR surrogate ,
372+then the deferred
373+.I high surrogate
374+is combined with the
375+.I low surrogate
376+argument,
377+and the two are converted as a pair;
378+the resultant conversion is stored in the conversion buffer,
379+the
380+.B \%mbstate_t
381+reference is reset to the
382+.I initial conversion
383+.IR state ,
384+and the function returns the number of bytes
385+which were stored in the conversion buffer.
386+.ll +2n
387+.RE
388+.
389+.PP
390+These capabilities of MinGW\(aqs
391+.BR \%wcrtomb ()
392+are certainly non\(hystandard;
393+nonetheless,
394+they are required to circumvent non\(hyconformity,
395+which is imposed by an unfortunate Microsoft design choice,
396+and it is incumbent upon the caller of
397+.BR \%wcrtomb (),
398+on the \%MS\(hyWindows platform,
399+to make use of them.
400+The preceding example clearly illustrates how strictly
401+.B \%ISO\(hyC99
402+conforming usage will yield incorrect behaviour;
403+the following illustrates how that example may be adapted,
404+by incorporation of the above non\(hystandard features,
405+to achieve correct behaviour:
406+.PP
407+.RS 4n
408+.EX
409+#define _ISOC99_SOURCE
410+
411+#include <stdio.h>
412+#include <stdlib.h>
413+#include <locale.h>
414+#include <limits.h>
415+#include <winnls.h>
416+#include <wchar.h>
417+
418+void print_conv( const wchar_t * );
419+
420+int main()
421+{
422+ setlocale( LC_CTYPE, "" );
423+ putenv( "LC_CTYPE=en_GB.65001" );
424+ print_conv( L"\eu6c34\eU0001d10b" );
425+ return 0;
426+}
427+
428+#define DESC(FMT) FMT "0x%1$04X --> %2$u byte%3$s"
429+
430+void print_conv( const wchar_t *wcs )
431+{
432+ while( *wcs != L'\e0' )
433+ {
434+ wchar_t wch = *wcs;
435+ char mbs[MB_LEN_MAX];
436+ mbstate_t ps = (mbstate_t)(0);
437+ const char *fmt = DESC( "Single wide character: " );
438+ size_t n = wcrtomb( mbs, wch, &ps );
439+
440+ if( (n == (size_t)(0)) && IS_HIGH_SURROGATE( wch ) )
441+ {
442+ if( (int)(n = wcrtomb( mbs, wcs[1], &ps )) > 0 )
443+ {
444+ fmt = "Surrogate pair: 0x%1$04X:0x%4$04X --> %2$u byte%3$s";
445+ wcs++;
446+ }
447+ }
448+ if( (int)(n) > 0 )
449+ {
450+ unsigned char *p = (unsigned char *)(mbs);
451+ printf( fmt, wch, n, (n == 1) ? ": " : "s: ", *wcs );
452+ while( n > 0 )
453+ printf( "0x%02X%c", *p++, (\-\-n == 0) ? '\en' : ':' );
454+ }
455+ else if( n == (size_t)(\-1) ) perror( "wcrtomb" );
456+ if( *wcs != L'\e0' ) ++wcs;
457+ }
458+}
459+.EE
460+.RE
461+.PP
462+It may be observed that,
463+on execution of this modified version of the example,
464+both the \fCL"\eu6c34"\fP,
465+and the \fCL"\eU0001d10b"\fP code points are now correctly evaluated,
466+producing the expected output:
467+.PP
468+.RS 2n
469+.EX
470+Single wide character: 0x6C34 --> 3 bytes: 0xE6:0xB0:0xB4
471+Surrogate pair: 0xD834:0xDD0B --> 4 bytes: 0xF0:0x9D:0x84:0x8B
472+.EE
473+.RE
474+.
475+.
476+.SH SEE ALSO
477+.BR mbsinit (3),
478+and
479+.BR wcsrtombs (3)
480+.
481+.
482+.SH AUTHOR
483+This manpage was written by \%Keith\ Marshall,
484+\%<keith@users.osdn.me>,
485+to document the
486+.BR \%wcrtomb ()
487+function as it has been implemented for the MinGW.org Project.
488+It may be copied, modified and redistributed,
489+without restriction of copyright,
490+provided this acknowledgement of contribution by
491+the original author remains in place.
492+.
493+.\" EOF
--- /dev/null
+++ b/mingwrt/man/wcsrtombs.3.man
@@ -0,0 +1,361 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B \%wcsrtombs
6+\- convert a wide character string to a multibyte character string
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < wchar.h >
12+.PP
13+.B size_t wcsrtombs( char
14+.BI * dst ,
15+.B const wchar_t
16+.BI ** src ,
17+.B size_t
18+.IB len ,
19+.B mbstate_t
20+.BI * ps
21+.B );
22+.
23+.IP \& -4n
24+Feature Test Macro Requirements for libmingwex:
25+.PP
26+.BR \%__MSVCRT_VERSION__ :
27+since \%mingwrt\(hy5.3,
28+if this feature test macro is
29+.IR defined ,
30+with a value of
31+.I at least
32+.IR \%0x0800 ,
33+(corresponding to the symbolic constant,
34+.BR \%__MSVCR80_DLL ,
35+and thus declaring intent to link with \%MSVCR80.DLL,
36+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
37+instead of with \%MSVCRT.DLL),
38+calls to
39+.BR \%wcsrtombs ()
40+will be directed to the implementation thereof,
41+within \%Microsoft\(aqs runtime DLL.
42+.
43+.PP
44+.BR \%_ISOC99_SOURCE ,
45+.BR \%_ISOC11_SOURCE :
46+since \%mingwrt\(hy5.3.1,
47+when linking with \%MSVCRT.DLL,
48+or when
49+.B \%__MSVCRT_VERSION__
50+is either
51+.IR undefined ,
52+or is
53+.I defined
54+with any value which is
55+.I less than
56+.IR \%0x0800 ,
57+(thus denying intent to link with \%MSVCR80.DLL,
58+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
59+.I explicitly
60+defining either of these feature test macros
61+will cause any call to
62+.BR \%wcsrtombs ()
63+to be directed to the
64+.I \%libmingwex
65+implementation;
66+if neither macro is defined,
67+calls to
68+.BR \%wcsrtombs ()
69+will be directed to Microsoft\(aqs runtime implementation,
70+if it is available,
71+otherwise falling back to the
72+.I \%libmingwex
73+implementation.
74+.
75+.PP
76+Prior to \%mingwrt\(hy5.3,
77+none of the above feature test macros have any effect on
78+.BR \%wcsrtombs ();
79+all calls will be directed to the
80+.I \%libmingwex
81+implementation.
82+.
83+.
84+.SH DESCRIPTION
85+The
86+.BR \%wcsrtombs ()
87+function converts a sequence of wide characters from
88+the array which is indirectly pointed to by
89+.IR src ,
90+to a corresponding multibyte character sequence in
91+the codeset which is associated with the
92+.B \%LC_CTYPE
93+category of the active process locale,
94+beginning in the conversion state which is represented by the
95+.B \%mbstate_t
96+object at
97+.IR *ps ;
98+each wide character is converted,
99+as if by calling the
100+.BR \%wcrtomb (3)
101+function.
102+.
103+.PP
104+Conversion continues until:
105+.RS 2n
106+.ll -2n
107+.IP \(bu 2n
108+A wide character which is invalid in its own context is encountered.
109+.
110+.IP \(bu 2n
111+A wide character which does not have a valid representation within
112+the target multibyte codeset is encountered.
113+.
114+.IP \(bu 2n
115+The NUL wide character is encountered,
116+while in the initial conversion state.
117+.
118+.IP \(bu 2n
119+The
120+.I dst
121+argument is not a NULL pointer,
122+and a wide character is encountered for which
123+the converted length would cause the aggregate length
124+of the converted multibyte character string to exceed
125+the limit specified by the
126+.I len
127+argument.
128+.ll +2n
129+.RE
130+.
131+.PP
132+If
133+.I dst
134+is
135+.I not
136+a NULL pointer,
137+the multibyte character string resulting from successful conversion,
138+up to a maximum of
139+.I len
140+bytes,
141+is stored in the multibyte array starting at
142+.IR dst .
143+If the conversion is NUL terminated,
144+the wide character string reference pointed to by
145+.I src
146+is replaced by a NULL pointer;
147+otherwise it is updated to point to the address immediately
148+following that of the last wide character converted.
149+.
150+.PP
151+If
152+.I dst
153+is a NULL pointer,
154+the aggregate count of bytes required
155+to represent the conversion is accumulated,
156+until any one of the preceding termination conditions is encountered;
157+the
158+.I len
159+argument,
160+and the termination condition which is dependent upon it,
161+are ignored,
162+and the conversion is not stored.
163+.
164+.PP
165+If
166+.I ps
167+is a NULL pointer,
168+the
169+.BR \%wcsrtombs ()
170+function uses a static internal
171+.B \%mbstate_t
172+object,
173+which is known only to,
174+and visible only within the scope of execution of,
175+the
176+.BR \%wcsrtombs ()
177+function itself.
178+.
179+.PP
180+Following a successful conversion,
181+the
182+.B \%mbstate_t
183+object at
184+.IR *ps ,
185+or the internal
186+.B \%mbstate_t
187+object if appropriate,
188+is reset to the initial conversion state.
189+.
190+.
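The way wcsrtombs() advances the pointer at src, and finally replaces it by NULL, can be illustrated with a chunked conversion. The following is a minimal sketch, not part of the committed manpage: the helper name convert_in_chunks, and its four-byte work buffer, are hypothetical, and it assumes a codeset in which every converted character fits within that buffer (plain ASCII input in the default "C" locale, for instance).

```c
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert wcs through a small fixed-size chunk
 * buffer, appending each converted chunk to out, (capacity outsize);
 * returns the total byte count, excluding the terminating NUL, or
 * (size_t)(-1) on conversion failure, lack of progress, or overflow.
 */
size_t convert_in_chunks( const wchar_t *wcs, char *out, size_t outsize )
{
  const wchar_t *src = wcs;
  size_t total = 0;
  mbstate_t state;
  memset( &state, 0, sizeof( state ) );   /* initial conversion state */

  /* wcsrtombs() replaces src by NULL only after it has converted
   * the terminating NUL wide character.
   */
  while( src != NULL )
  {
    char chunk[4];
    size_t n = wcsrtombs( chunk, &src, sizeof( chunk ), &state );

    if( n == (size_t)(-1) ) return n;     /* conversion failed */
    if( (n == 0) && (src != NULL) )       /* chunk too small for next char */
      return (size_t)(-1);
    if( total + n >= outsize )            /* no room for bytes plus NUL */
      return (size_t)(-1);

    memcpy( out + total, chunk, n );
    total += n;
  }
  out[total] = '\0';
  return total;
}
```

Each pass resumes from wherever the previous call left src, so the caller never needs to track the input position separately.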
191+.SH RETURN VALUE
192+When conversion is successful,
193+and
194+.I dst
195+is
196+.I not
197+a NULL pointer,
198+the
199+.BR \%wcsrtombs ()
200+function returns the number of bytes stored at
201+.IR dst ,
202+to represent the resulting multibyte character sequence,
203+.I excluding
204+the terminating NUL,
205+(if any).
206+.
207+.PP
208+Conversely,
209+when conversion is successful,
210+but
211+.IR dst " is"
212+a NULL pointer,
213+the
214+.BR \%wcsrtombs ()
215+function returns the number of bytes which would be required
216+to store the entire multibyte character string resulting from
217+the successful conversion,
218+.I excluding
219+the terminating NUL.
220+.
221+.
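The two return value cases described above lend themselves to a two-pass idiom: query the required size with a NULL dst, then allocate and convert. The sketch below is illustrative only; the helper name wcs_to_mbs_dup is hypothetical, and the caller is assumed to have already selected a suitable LC_CTYPE locale.

```c
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: return a newly allocated multibyte copy of
 * the NUL terminated wide string wcs, or NULL on conversion or
 * allocation failure; the caller must free() the result.
 */
char *wcs_to_mbs_dup( const wchar_t *wcs )
{
  const wchar_t *src = wcs;
  mbstate_t state;
  char *mbs;
  size_t len;

  /* First pass: dst is NULL, so only the required byte count,
   * excluding the terminating NUL, is returned.
   */
  memset( &state, 0, sizeof( state ) );
  len = wcsrtombs( NULL, &src, 0, &state );
  if( len == (size_t)(-1) ) return NULL;

  if( (mbs = malloc( len + 1 )) == NULL ) return NULL;

  /* Second pass: restart from the beginning, in the initial
   * conversion state, allowing one extra byte for the NUL.
   */
  src = wcs;
  memset( &state, 0, sizeof( state ) );
  if( wcsrtombs( mbs, &src, len + 1, &state ) == (size_t)(-1) )
  {
    free( mbs );
    return NULL;
  }
  return mbs;
}
```

Because the reported count excludes the terminating NUL, the allocation adds one byte.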
222+.SH ERROR CONDITIONS
223+If conversion is unsuccessful,
224+.I \%errno
225+is set to
226+.BR \%EILSEQ ,
227+the
228+.BR \%wcsrtombs ()
229+function returns
230+.IR (size_t)(\-1) ,
231+and the conversion state is unspecified.
232+.
233+.
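A caller might check for the failure return as sketched below. This is an illustrative sketch under stated assumptions, not code from the mingwrt sources; the helper name convert_or_errno is hypothetical.

```c
#include <errno.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: convert wcs into buf, (capacity buflen bytes);
 * returns zero on success, or the errno value, (expected to be EILSEQ),
 * when wcsrtombs() reports a conversion failure.
 */
int convert_or_errno( const wchar_t *wcs, char *buf, size_t buflen )
{
  mbstate_t state;
  memset( &state, 0, sizeof( state ) );   /* initial conversion state */

  errno = 0;
  if( wcsrtombs( buf, &wcs, buflen, &state ) == (size_t)(-1) )
    /* The conversion state is now unspecified; it is local to
     * this function, so it is simply discarded on return.
     */
    return errno;

  return 0;
}
```

Comparing the result against (size_t)(-1), rather than testing for a negative value, matters because size_t is an unsigned type.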
234+.SH STANDARDS CONFORMANCE
235+Except in respect of its extended provision for handling of
236+.IR surrogate\ pairs ,
237+and to the extent that it may be affected by limitations
238+of the underlying \%MS\(hyWindows API,
239+the
240+.I \%libmingwex
241+implementation of
242+.BR \%wcsrtombs ()
243+conforms generally to
244+.BR \%ISO\(hyC99 ,
245+.BR \%POSIX.1\(hy2001 ,
246+and
247+.BR \%POSIX.1\(hy2008 ;
248+(prior to \%mingwrt\(hy5.3,
249+and in those cases where calls may be delegated
250+to a Microsoft runtime DLL implementation,
251+this level of conformity may not be achieved).
252+.
253+.
254+.\"SH EXAMPLE
255+.
256+.
257+.SH CAVEATS AND BUGS
258+Due to a documented limitation of Microsoft\(aqs
259+.BR \%setlocale ()
260+function implementation,
261+it is not possible to directly select an active locale,
262+in which the codeset is represented by any multibyte
263+character sequence with an effective
264+.B \%MB_CUR_MAX
265+of more than two bytes.
266+Prior to \%mingwrt\(hy5.3,
267+this limitation precludes the use of
268+.BR \%wcsrtombs ()
269+to convert to any codeset with
270+.B \%MB_CUR_MAX
271+greater than two bytes,
272+(such as
273+.BR \%UTF\(hy8 ).
274+From \%mingwrt\(hy5.3 onward,
275+the MinGW.org implementation of
276+.BR \%wcsrtombs ()
277+mitigates this limitation by assignment of the codeset
278+from the
279+.B \%LC_CTYPE
280+environment variable,
281+provided the system default has been previously activated
282+for the
283+.B \%LC_CTYPE
284+locale category;
285+e.g.\ execution of:
286+.PP
287+.RS 4n
288+.EX
289+#define _ISOC99_SOURCE
290+
291+#include <stdio.h>
292+#include <stdlib.h>
293+#include <locale.h>
294+#include <wchar.h>
295+
296+void print_conv( const wchar_t * );
297+
298+int main()
299+{
300+ setlocale( LC_CTYPE, "" );
301+ putenv( "LC_CTYPE=en_GB.65001" );
302+ print_conv( L"\eu6c34\eU0001d10b" );
303+ return 0;
304+}
305+
306+void print_conv( const wchar_t *wcs )
307+{
308+ size_t len;
309+ if( (len = 1 + wcsrtombs( NULL, &wcs, 0, NULL )) > 0 )
310+ {
311+ const wchar_t *wc = wcs;
312+ size_t n = 1 + wcslen( wcs );
313+ unsigned char mbs[len], *mb = mbs;
314+ printf( "UTF-16: %u value%s: ", n, (n == 1) ? "" : "s" );
315+ do { printf( "0x%04X%c", *wc, (*wc == L'\e0') ? '\en' : ':' );
316+ } while( *wc++ != L'\e0' );
317+ printf( "UTF-8: %u byte%s: ",
318+ 1 + wcsrtombs( mbs, &wcs, len, NULL ),
319+ (len == 1) ? "" : "s"
320+ );
321+ do { printf( "0x%02X%c", *mb, (*mb == '\e0') ? '\en' : ':' );
322+ } while( *mb++ != '\e0' );
323+ }
324+ else perror( "wcsrtombs" );
325+}
326+.EE
327+.RE
328+.PP
329+will select
330+.B \%UTF\(hy8
331+as the target codeset,
332+then convert the \fC\%L"\eu6c34\eU0001d10b"\fP
333+wide character string,
334+resulting in the output:
335+.PP
336+.RS 4n
337+.EX
338+UTF-16: 4 values: 0x6C34:0xD834:0xDD0B:0x0000
339+UTF-8: 8 bytes: 0xE6:0xB0:0xB4:0xF0:0x9D:0x84:0x8B:0x00
340+.EE
341+.RE
342+.
343+.
344+.SH SEE ALSO
345+.BR mbsinit (3),
346+and
347+.BR wcrtomb (3)
348+.
349+.
350+.SH AUTHOR
351+This manpage was written by \%Keith\ Marshall,
352+\%<keith@users.osdn.me>,
353+to document the
354+.BR \%wcsrtombs ()
355+function as it has been implemented for the MinGW.org Project.
356+It may be copied, modified and redistributed,
357+without restriction of copyright,
358+provided this acknowledgement of contribution by
359+the original author remains in place.
360+.
361+.\" EOF
--- /dev/null
+++ b/mingwrt/man/wctob.3.man
@@ -0,0 +1,174 @@
1+.\" vim: ft=nroff
2+.TH %PAGEREF% MinGW "MinGW Programmer's Reference Manual"
3+.
4+.SH NAME
5+.B \%wctob
6+\- convert a wide character to a single byte
7+.
8+.
9+.SH SYNOPSIS
10+.B #include
11+.RB < stdio.h >
12+.br
13+.B #include
14+.RB < wchar.h >
15+.PP
16+.B int wctob( wint_t
17+.I c
18+.B );
19+.
20+.IP \& -4n
21+Feature Test Macro Requirements for libmingwex:
22+.PP
23+.BR \%__MSVCRT_VERSION__ :
24+since \%mingwrt\(hy5.3,
25+if this feature test macro is
26+.IR defined ,
27+with a value of
28+.I at least
29+.IR \%0x0800 ,
30+(corresponding to the symbolic constant,
31+.BR \%__MSVCR80_DLL ,
32+and thus declaring intent to link with \%MSVCR80.DLL,
33+or any later version of \%Microsoft\(aqs \%non\(hyfree runtime library,
34+instead of with \%MSVCRT.DLL),
35+calls to
36+.BR \%wctob ()
37+will be directed to the implementation thereof,
38+within \%Microsoft\(aqs runtime DLL.
39+.
40+.PP
41+.BR \%_ISOC99_SOURCE ,
42+.BR \%_ISOC11_SOURCE :
43+since \%mingwrt\(hy5.3.1,
44+when linking with \%MSVCRT.DLL,
45+or when
46+.B \%__MSVCRT_VERSION__
47+is either
48+.IR undefined ,
49+or is
50+.I defined
51+with any value which is
52+.I less than
53+.IR \%0x0800 ,
54+(thus denying intent to link with \%MSVCR80.DLL,
55+or any later \%non\(hyfree version of Microsoft\(aqs runtime library),
56+.I explicitly
57+defining either of these feature test macros
58+will cause any call to
59+.BR \%wctob ()
60+to be directed to the
61+.I \%libmingwex
62+implementation;
63+if neither macro is defined,
64+calls to
65+.BR \%wctob ()
66+will be directed to Microsoft\(aqs runtime implementation,
67+if it is available,
68+otherwise falling back to the
69+.I \%libmingwex
70+implementation.
71+.
72+.PP
73+Prior to \%mingwrt\(hy5.3,
74+none of the above feature test macros have any effect on
75+.BR \%wctob ();
76+all calls will be directed to the
77+.I \%libmingwex
78+implementation.
79+.
80+.
81+.SH DESCRIPTION
82+The
83+.BR \%wctob ()
84+function converts the wide character,
85+represented by
86+.IR c ,
87+to a multibyte character sequence
88+in the codeset which is associated with the
89+.B \%LC_CTYPE
90+category of the active process locale.
91+Provided the entire conversion can be accommodated
92+within a single byte,
93+the value of that byte,
94+interpreted as an
95+.IR unsigned\ char ,
96+and cast to an
97+.IR int ,
98+is returned;
99+otherwise,
100+.B EOF
101+is returned.
102+.
103+.
104+.SH RETURN VALUE
105+If the conversion of
106+.IR c ,
107+to a multibyte character sequence,
108+in its entirety,
109+occupies exactly
110+.I one
111+byte,
112+the value of that byte,
113+interpreted as an
114+.IR unsigned\ char ,
115+and cast to an
116+.IR int ,
117+is returned;
118+otherwise,
119+.B EOF
120+is returned.
121+.
122+.
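As a brief illustration of the return convention described above, the following sketch wraps wctob() in a predicate. The helper name wc_fits_one_byte is hypothetical, and the result depends upon the codeset of the active LC_CTYPE locale.

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: return 1, and store the byte value at *out,
 * (when out is not NULL), if wc converts to exactly one byte in the
 * current locale; otherwise return 0, and leave *out untouched.
 */
int wc_fits_one_byte( wchar_t wc, unsigned char *out )
{
  int b = wctob( (wint_t)(wc) );

  if( b == EOF ) return 0;        /* no single byte representation */
  if( out != NULL ) *out = (unsigned char)(b);
  return 1;
}
```

The result must be held in an int, before comparison with EOF, since a valid byte value of 0xFF would otherwise be indistinguishable from EOF after narrowing to char.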
123+.SH ERROR CONDITIONS
124+No error conditions are defined.
125+.
126+.
127+.SH STANDARDS CONFORMANCE
128+Except to the extent that it may be affected by limitations
129+of the underlying \%MS\(hyWindows API,
130+the
131+.I \%libmingwex
132+implementation of
133+.BR \%wctob ()
134+conforms generally to
135+.BR \%ISO\(hyC99 ,
136+.BR \%POSIX.1\(hy2001 ,
137+and
138+.BR \%POSIX.1\(hy2008 ;
139+(prior to \%mingwrt\(hy5.3,
140+and in those cases where calls may be delegated
141+to a Microsoft runtime DLL implementation,
142+this level of conformity may not be achieved).
143+.
144+.
145+.\"SH EXAMPLE
146+.
147+.
148+.SH CAVEATS AND BUGS
149+Use of the
150+.BR \%wctob ()
151+function is
152+.IR discouraged ;
153+it serves no purpose which may not be better served by the
154+.BR \%wcrtomb (3)
155+function,
156+which should be considered as a preferred alternative.
157+.
158+.
159+.SH SEE ALSO
160+.BR wcrtomb (3)
161+.
162+.
163+.SH AUTHOR
164+This manpage was written by \%Keith\ Marshall,
165+\%<keith@users.osdn.me>,
166+to document the
167+.BR \%wctob ()
168+function as it has been implemented for the MinGW.org Project.
169+It may be copied, modified and redistributed,
170+without restriction of copyright,
171+provided this acknowledgement of contribution by
172+the original author remains in place.
173+.
174+.\" EOF