Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GSoC][RFC/Patch] FIX: use utf8_strnwidth for line_prefix in diff.c #1653

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

InnocentZero
Copy link

@InnocentZero InnocentZero commented Jan 23, 2024

This patch adresses diff.c:2721 and proposes the fix using a new function.

cc: Christian Couder [email protected]

This patch adresses diff.c:2721 and proposes the fix using a new function.

Signed-off-by: Md Isfarul Haque <[email protected]>
@dscho
Copy link
Member

dscho commented Jan 23, 2024

@InnocentZero please force-push to update PRs, don't close them and open a new one. Thank you!

Signed-off-by: Md Isfarul Haque <[email protected]>
@InnocentZero
Copy link
Author

@InnocentZero please force-push to update PRs, don't close them and open a new one. Thank you!

I understand. I apologize for the confusion.

@InnocentZero
Copy link
Author

@dscho can you /allow me again? It seems like the last time failed due to a private email.

@dscho
Copy link
Member

dscho commented Jan 24, 2024

That should not have prevented you from being allowed. Try /preview.

@InnocentZero
Copy link
Author

/preview

Copy link

Error: Could not determine full name of InnocentZero

@InnocentZero
Copy link
Author

/preview

Copy link

Preview email sent as [email protected]

@InnocentZero
Copy link
Author

/submit

Copy link

Submitted as [email protected]

To fetch this version into FETCH_HEAD:

git fetch https://github.com/gitgitgadget/git/ pr-git-1653/InnocentZero/diff_needswork-v1

To fetch this version to local tag pr-git-1653/InnocentZero/diff_needswork-v1:

git fetch --no-tags https://github.com/gitgitgadget/git/ tag pr-git-1653/InnocentZero/diff_needswork-v1

Copy link

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Md Isfarul Haque via GitGitGadget" <[email protected]> writes:

> This patch adresses diff.c:2721 and proposes the fix using a new function.

Yay.  My favorite long-time pet-peeve topic.

>
> Md Isfarul Haque (2):
>   FIX: use utf8_strnwidth for line_prefix in diff.c
>   FIX memory leak in one branch

Our convention does not use "FIX" and other prefix on the subject.
Please check Documentation/SubmittingPatches.

Thanks.

@@ -2300,6 +2300,20 @@ const char *diff_line_prefix(struct diff_options *opt)
return msgbuf->buf;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Christian Couder wrote (reply to this):

On Wed, Jan 24, 2024 at 3:04 PM Md Isfarul Haque via GitGitGadget
<[email protected]> wrote:
>
> From: Md Isfarul Haque <[email protected]>
>
> This patch adresses diff.c:2721 and proposes the fix using a new function.

Please give more details here about what is currently at diff.c:2721
and what the patch is fixing, as lines at diff.c:2721 could move to a
different location if some changes are made to diff.c before your
patches get merged or after they get merged.

Also if the patch is addressing an issue in a code comment I would
expect the corresponding code comment to be removed by the patch.

About the subject, maybe "diff: use utf8_strnwidth() for line_prefix"
is already better.

> Signed-off-by: Md Isfarul Haque <[email protected]>
> ---
>  diff.c | 18 ++++++++++++++++--
>  diff.h |  1 +
>  2 files changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/diff.c b/diff.c
> index a89a6a6128a..e3223b8ce5b 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2300,6 +2300,20 @@ const char *diff_line_prefix(struct diff_options *opt)
>         return msgbuf->buf;
>  }
>
> +const struct strbuf *diff_line_prefix_buf(struct diff_options *opt)

This function seems to be used only in diff.c, so it could be static.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Md Isfarul Haque wrote (reply to this):

On 1/25/24 01:31, Christian Couder wrote:
> On Wed, Jan 24, 2024 at 3:04 PM Md Isfarul Haque via GitGitGadget
> <[email protected]> wrote:
>>
>> From: Md Isfarul Haque <[email protected]>
>>
>> This patch adresses diff.c:2721 and proposes the fix using a new function.
> 
> Please give more details here about what is currently at diff.c:2721
> and what the patch is fixing, as lines at diff.c:2721 could move to a
> different location if some changes are made to diff.c before your
> patches get merged or after they get merged.
> 
> Also if the patch is addressing an issue in a code comment I would
> expect the corresponding code comment to be removed by the patch.

I understand and apologize for the mess-up. I will keep it in mind next time. 

> About the subject, maybe "diff: use utf8_strnwidth() for line_prefix"
> is already better.
> 
>> Signed-off-by: Md Isfarul Haque <[email protected]>
>> ---
>>  diff.c | 18 ++++++++++++++++--
>>  diff.h |  1 +
>>  2 files changed, 17 insertions(+), 2 deletions(-)
>>
>> diff --git a/diff.c b/diff.c
>> index a89a6a6128a..e3223b8ce5b 100644
>> --- a/diff.c
>> +++ b/diff.c
>> @@ -2300,6 +2300,20 @@ const char *diff_line_prefix(struct diff_options *opt)
>>         return msgbuf->buf;
>>  }
>>
>> +const struct strbuf *diff_line_prefix_buf(struct diff_options *opt)
> 
> This function seems to be used only in diff.c, so it could be static.

As Junio pointed out, I will probably use the existing function with slight
modifications and use it. Besides, having unnecessary allocations and frees
will probably only add overhead.

-- 

Thanks and regards
Md. Isfarul Haque

Copy link

User Christian Couder <[email protected]> has been added to the cc: list.

@@ -2300,6 +2300,20 @@ const char *diff_line_prefix(struct diff_options *opt)
return msgbuf->buf;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Md Isfarul Haque via GitGitGadget" <[email protected]> writes:

> From: Md Isfarul Haque <[email protected]>
>
> This patch adresses diff.c:2721 and proposes the fix using a new function.

Once the issue has fully been addressed, it is expected that the
NEEDSWORK comment there would be removed, making this proposed log
message useless.  Make it a habit to write a log message that is
self-contained enough to help readers (including yourself in the
future when you have forgotten the details of what you did in this
commit).

> +const struct strbuf *diff_line_prefix_buf(struct diff_options *opt)
> +{

Given that there is only one caller of this function in the same
file, I do not see a reason why this needs to be extern and exported
in diff.h (actually I do not see a need for this helper at all).

When dealing with a string buffer, it is much more common in this
codebase for the caller to prepare a strbuf (often on its stack) and
pass a pointer to it to helper functions.  I.e.

	static void prepare_diff_line_prefix_buf(struct strbuf *buf,
						struct diff_options *opt)
	{
		... stuff whatever you need into the string buffer ...
                strbuf_add(buf, ...);
	}

	/* in show_stats() */
	struct strbuf line_prefix = STRBUF_INIT;
	...
	prepare_diff_line_prefix_buf(&line_prefix, options);
	... use line_prefix and ...
	... release the resource before returning ...
	strbuf_release(&line_prefix);
	
is more common and less prone to resource leak over time.

> @@ -2635,7 +2649,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>  	int width, name_width, graph_width, number_width = 0, bin_width = 0;
>  	const char *reset, *add_c, *del_c;
>  	int extra_shown = 0;
> -	const char *line_prefix = diff_line_prefix(options);
> +	const struct strbuf *line_prefix = diff_line_prefix_buf(options);
>  	struct strbuf out = STRBUF_INIT;
>  
>  	if (data->nr == 0)
> @@ -2718,7 +2732,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>  	 * used to correctly count the display width instead of strlen().
>  	 */
>  	if (options->stat_width == -1)
> -		width = term_columns() - strlen(line_prefix);
> +		width = term_columns() - utf8_strnwidth(line_prefix->buf, line_prefix->len, 1);

I do not see the need for any of the diff_line_prefix_buf() related
changes, only to do this change.  You have a const char *line_prefix
at this point, and utf8_strnwidth() wants to know its length, so
what you need is to massage the parameter to match what it wants.
Perhaps even something simple and dumb like

	utf8_strnwidth(line_prefix, strlen(line_prefix), 1);

might be sufficient to replace strlen(line_prefix) in the original?

This patch hopefully will change the behaviour of the command.  A
patch needs to also protect the change from future breakages by
adding a test or two to demonstrate the desired behaviour.  Such a
test should pass with the code change in the patch, and should fail
when the code change in the patch gets reverted.

Thanks.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Md Isfarul Haque wrote (reply to this):

On 1/25/24 01:38, Junio C Hamano wrote:
> "Md Isfarul Haque via GitGitGadget" <[email protected]> writes:
> 
>> From: Md Isfarul Haque <[email protected]>
>>
>> This patch adresses diff.c:2721 and proposes the fix using a new function.
> 
> Once the issue has fully been addressed, it is expected that the
> NEEDSWORK comment there would be removed, making this proposed log
> message useless.  Make it a habit to write a log message that is
> self-contained enough to help readers (including yourself in the
> future when you have forgotten the details of what you did in this
> commit).
> 

I understand. Sorry for the mess-up. I will keep it in mind the next time.

>> +const struct strbuf *diff_line_prefix_buf(struct diff_options *opt)
>> +{
> 
> Given that there is only one caller of this function in the same
> file, I do not see a reason why this needs to be extern and exported
> in diff.h (actually I do not see a need for this helper at all).
> 
> When dealing with a string buffer, it is much more common in this
> codebase for the caller to prepare a strbuf (often on its stack) and
> pass a pointer to it to helper functions.  I.e.
> 
> 	static void prepare_diff_line_prefix_buf(struct strbuf *buf,
> 						struct diff_options *opt)
> 	{
> 		... stuff whatever you need into the string buffer ...
>                 strbuf_add(buf, ...);
> 	}
> 
> 	/* in show_stats() */
> 	struct strbuf line_prefix = STRBUF_INIT;
> 	...
> 	prepare_diff_line_prefix_buf(&line_prefix, options);
> 	... use line_prefix and ...
> 	... release the resource before returning ...
> 	strbuf_release(&line_prefix);
> 	
> is more common and less prone to resource leak over time.
> 

Ah, this is indeed very neat. Didn't strike me. I'm not extremely familiar
with the codebase and was unaware of this practice. I will follow this 
pattern in the future.

>> @@ -2635,7 +2649,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  	int width, name_width, graph_width, number_width = 0, bin_width = 0;
>>  	const char *reset, *add_c, *del_c;
>>  	int extra_shown = 0;
>> -	const char *line_prefix = diff_line_prefix(options);
>> +	const struct strbuf *line_prefix = diff_line_prefix_buf(options);
>>  	struct strbuf out = STRBUF_INIT;
>>  
>>  	if (data->nr == 0)
>> @@ -2718,7 +2732,7 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options)
>>  	 * used to correctly count the display width instead of strlen().
>>  	 */
>>  	if (options->stat_width == -1)
>> -		width = term_columns() - strlen(line_prefix);
>> +		width = term_columns() - utf8_strnwidth(line_prefix->buf, line_prefix->len, 1);
> 
> I do not see the need for any of the diff_line_prefix_buf() related
> changes, only to do this change.  You have a const char *line_prefix
> at this point, and utf8_strnwidth() wants to know its length, so
> what you need is to massage the parameter to match what it wants.
> Perhaps even something simple and dumb like
> 
> 	utf8_strnwidth(line_prefix, strlen(line_prefix), 1);
> 
> might be sufficient to replace strlen(line_prefix) in the original?

It was more of a force of habit on my end, since I usually do not use
functions that do not have a limit on the length they are reading.
However, considering that the string is generated by another function
and is most likely safe as it was used earlier, I will implement 
this suggestion.

> 
> This patch hopefully will change the behaviour of the command.  A
> patch needs to also protect the change from future breakages by
> adding a test or two to demonstrate the desired behaviour.  Such a
> test should pass with the code change in the patch, and should fail
> when the code change in the patch gets reverted.
> 

Alright. Where should I add the test? A new/existing test in t/t4013 
or t4124-log-graph-octopus.sh?

-- 

Thanks and regards,
Md Isfarul Haque

if (!opt->output_prefix){
msgbuf->buf = "";
msgbuf->len = 0;
msgbuf->alloc = 1;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Md Isfarul Haque via GitGitGadget" <[email protected]> writes:

> From: Md Isfarul Haque <[email protected]>
>
> Signed-off-by: Md Isfarul Haque <[email protected]>
> ---
>  diff.c | 1 +
>  1 file changed, 1 insertion(+)

We do not need to see that you are just as human as other developers
and are prone to make mistakes that you need to fix it in a
follow-up.  In other words, please do not introduce a bug in [1/2]
only to be fixed in [2/2].  By squashing these two patches into one,
you can pretend as if you are a perfect developer and never leaked
memory ;-).

> diff --git a/diff.c b/diff.c
> index e3223b8ce5b..9fa00103a6b 100644
> --- a/diff.c
> +++ b/diff.c
> @@ -2309,6 +2309,7 @@ const struct strbuf *diff_line_prefix_buf(struct diff_options *opt)
>  		msgbuf->alloc = 1;
>  	}
>  	else {
> +		free(msgbuf);
>  		msgbuf = opt->output_prefix(opt, opt->output_prefix_data);
>  	}
>  	return msgbuf;

But as I said, I do not see a need for this helper function in the
first place, so...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants