Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TIdHTTPResponseInfo.WriteContent improperly uses IOHandler.Write method #499

Open
Molochnik opened this issue Sep 10, 2023 · 4 comments
Open
Labels
Element: HTTP Issues related to HTTP handling, TIdHTTP, TIdHTTPServer, TIdHTTPProxyServer, etc Status: Reported Issue has been reported for review Type: Bug Issue is a bug in existing code

Comments

@Molochnik
Copy link

Molochnik commented Sep 10, 2023

TIdHTTPResponseInfo.WriteContent uses IOHandler.Write in two places and in both of them incorrectly. It causes wrong message encoding in Lazarus under Linux (unit IdCustomHTTPServer)
Now it is (line 2188) :

FConnection.IOHandler.Write(ContentText, CharsetToEncoding(CharSet));

Should be:

FConnection.IOHandler.Write(ContentText, CharsetToEncoding(CharSet){$IFDEF STRING_IS_ANSI}, CharsetToEncoding(CharSet){$ENDIF});

Now it is (line 2213) :

FConnection.IOHandler.Write('<HTML><BODY><B>' + IntToStr(ResponseNo) + ' ' + ResponseText    {Do not Localize}
       + '</B></BODY></HTML>', CharsetToEncoding(CharSet));    {Do not Localize};

Should be:

FConnection.IOHandler.Write('<HTML><BODY><B>' + IntToStr(ResponseNo) + ' ' + ResponseText    {Do not Localize}
       + '</B></BODY></HTML>', CharsetToEncoding(CharSet){$IFDEF STRING_IS_ANSI}, CharsetToEncoding(CharSet){$ENDIF});    {Do not Localize};
@Molochnik Molochnik added Status: Reported Issue has been reported for review Type: Bug Issue is a bug in existing code labels Sep 10, 2023
@rlebeau
Copy link
Member

rlebeau commented Sep 11, 2023

This is not really a bug, IMHO.

In non-Unicode builds, IOHandler.Write() takes in an AnsiString, where the optional ASrcEncoding parameter specifies the encoding of that AnsiString. By default, Indy expects that AnsiString to use the OS's system encoding, which is how Delphi and FreePascal normally handle AnsiString.

The AnsiString data will be converted to Unicode using the ASrcEncoding, and then that Unicode will be converted to the AByteEncoding for transmission.

The TIdHTTPResponseInfo.Charset property is meant to specify the output encoding on the wire, not the input encoding of the AnsiString. This allows users to send, for example, a Latin-1 input string as UTF-8 output without having to encode it themselves, keeping the AnsiString is the compiler's native format.

If you want to send a ContentText string using an AnsiString with a different encoding than the OS, you should set the IOHandler.DefAnsiEncoding property, which the IOHandler will use when no ASrcEncoding is specified. For instance, if you have FreePascal/Lazarus setup to use UTF-8 encoded AnsiStrings, you can set the IOHandler.DefAnsiEncoding to IndyTextEncoding_UTF8, such as in the server's OnConnect event.

@rlebeau rlebeau added the Element: HTTP Issues related to HTTP handling, TIdHTTP, TIdHTTPServer, TIdHTTPProxyServer, etc label Sep 11, 2023
@Molochnik
Copy link
Author

Molochnik commented Sep 11, 2023

Thank you very much! That helped. Some notes:

  1. FConnection is protected in TIdHTTPResponseInfo and I had to write a helper:
procedure TIdHTTPResponseInfoHelper.WriteContentUni;
begin
{$IFDEF STRING_IS_ANSI}
  FConnection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
{$ENDIF}
  WriteContent;
end;
  1. In Lazarus under Linux I have both STRING_IS_UNICODE and STRING_IS_ANSI undefined. How is that possible?

@rlebeau
Copy link
Member

rlebeau commented Sep 12, 2023

  1. FConnection is protected in TIdHTTPResponseInfo and I had to write a helper:

Why not simply set the DefAnsiEncoding in the OnConnect event, like I suggested?

procedure TMyForm.IdHTTPServer1Connect(AContext: TIdContext);
begin
{$IFDEF STRING_IS_ANSI}
  AContext.Connection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
{$ENDIF}
end;

If you are going to write a custom wrapper, I would suggest allowing the caller configure the encoding as needed, eg:

procedure TIdHTTPResponseInfoHelper.WriteContentUni(
  {$IFDEF STRING_IS_ANSI}ASrcEncoding: IIdTextEncoding = nil{$ENDIF}
);
begin
{$IFDEF STRING_IS_ANSI}
  EnsureEncoding(ASrcEncoding, encUTF8);
  FConnection.IOHandler.DefAnsiEncoding := ASrcEncoding;
{$ENDIF}
  WriteContent;
end;
  1. In Lazarus under Linux I have both STRING_IS_UNICODE and STRING_IS_ANSI undefined. How is that possible?

Indy defines the STRING_IS_... conditionals in its IdCompilerDefines.inc file, did you {$I} that file into your code?

Otherwise, Indy uses FreePascal's own FPC_UNICODESTRINGS conditional to determine when to define STRING_IS_UNICODE, so you could use that same conditional if you don't need compatibility with Delphi, eg:

procedure TMyForm.IdHTTPServer1Connect(AContext: TIdContext);
begin
{$IFNDEF FPC_UNICODESTRINGS}
  AContext.Connection.IOHandler.DefAnsiEncoding := IndyTextEncoding_UTF8;
{$ENDIF}
end;

Or:

procedure TIdHTTPResponseInfoHelper.WriteContentUni(
  {$IFNDEF FPC_UNICODESTRING}ASrcEncoding: IIdTextEncoding = nil{$ENDIF}
);
begin
{$IFNDEF FPC_UNICODESTRING}
  EnsureEncoding(ASrcEncoding, encUTF8);
  FConnection.IOHandler.DefAnsiEncoding := ASrcEncoding;
{$ENDIF}
  WriteContent;
end;

@Molochnik
Copy link
Author

Everything goes as usual - all bugs are yours, if you think you found a bug in an outer library go back to the start of the phrase. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Element: HTTP Issues related to HTTP handling, TIdHTTP, TIdHTTPServer, TIdHTTPProxyServer, etc Status: Reported Issue has been reported for review Type: Bug Issue is a bug in existing code
Projects
None yet
Development

No branches or pull requests

2 participants