Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework ANSIString to more generic EvString #3253

Draft
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

InspectorCaracal
Copy link
Contributor

@InspectorCaracal InspectorCaracal commented Aug 8, 2023

Brief overview of PR changes/additions

This will replace the telnet-focused ANSIString with an equivalent EvString built around Evennia markup.

As of this initial PR, the EvString class parses around Evennia markup rather than ANSI codes, and has methods for converting its data to the separate protocols (at the moment ansi and html). Those methods are then used in their respective protocol modules to convert the markup on sending.

The further goal is to integrate that approach and the EvString class into the other evennia formatting classes such as EvTable, but I haven't yet reviewed them. The tests have also not been updated yet, so there's no guarantees yet that this is fully working.

Motivation for adding to Evennia

The goal is to convert all of the server-side formatting to be entirely protocol agnostic and only use the defined Evennia style codes, such as |r, and have those codes converted into protocol-friendly forms from the actual protocol module itself.

Other info (issues closed, discussion etc)

See #3185 for the discussion.

@InspectorCaracal
Copy link
Contributor Author

A question for this: should converting an EvString or EvStringContainer (such as EvTable) to a str object render it with the markup, or without the markup?

@InspectorCaracal
Copy link
Contributor Author

InspectorCaracal commented Aug 8, 2023

Some other questions/issues that arose, for reference by me/others.

One EvString renders as equal to another if there is "invisible" markup at the ends - e.g. EvString("|relectric |c") is identified as equal to EvString("|relectric "). This shouldn't be the case - while they may render the same on their own, when adding another string like test to it would result in |relectric |ctest versus |relectric test which is clearly not equal. I haven't identified why it is yet to make sure I fix it properly.

Also, while working on updating forms, I discovered that EvString slices behave very unpredictably... At minimum, when taking a slice such as mystr[9:] it's including all of the codes from the first segment of the string that's not being included. The code affixing doesn't seem to occur for codes in the remaining string when doing something like mystr[:9] - only when taking a tail slice (although also probably prefixing a center slice). Based on some strange outputs I'd noticed in formatting codes previously with ANSIString, I think this has been happening all along but the change in parsing has made it more of a problem.

There's also a question of what should happening with leading and trailing color codes. Currently it's designed to preserve trailing color codes when slicing, but I think that they should only be preserved on stripping and the slice should occur before the code, since they're essentially switches, not open/close tags.

i.e. |relectric |cboogaloo|n should slice so that [:9] is |relectric and [9:] is |cboogaloo|n, rather than what currently seems intended, which is |relectric |c and |cboogaloo|n

Copy link
Member

@Griatch Griatch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great start of this project. I wonder if this kind of change is something one should merge directly into main or if one need to have this in a separate develop branch for a while; either way doing this change would mean a major version change (e.g. to Evennia 3.0.0).
The original implementation of ANSIString (which, as far as I understand was retained here with different internal storage) was itself based on the state of the Python string library in Python 2.x. So one should probably review what has happened in string since then and possibly refactor/clean up/modernize ANSISTring (EvString) as needed; possibly some of the weirdness comes from this (There is no concept of a separate unicode string class anymore for example).


# ------------------------------------------------------------
#
# EvString - ANSI-aware string class
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this is aware of evennia-style |-formatting rather than ANSI per-se, so this docstring is a bit misleading.

#
def _to_evstring(obj):
"""
convert to ANSIString.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer an ANSIString ;)

@InspectorCaracal
Copy link
Contributor Author

This still has a lot of work left - there's a lot of code to update and tests to work through - but since it came up recently and I was working on it more today, I figured I'd offer some reassurance that I haven't forgotten about it by committing and pushing my current progress.

@InspectorCaracal
Copy link
Contributor Author

A question for this: should converting an EvString or EvStringContainer (such as EvTable) to a str object render it with the markup, or without the markup?

Without the markup. EvString is the one handling the markup, accessing the str is accessing the internal value. repr could include the markup but converting EvString to html/others is not EvString's job.

That said, serializing the EvString should not simply serialize a str, it will need to be customized.

Sorry, I don't follow; what does converting an EvString to a normal string have to do with html?

At the moment, I convert it with the markup. e.g. str(EvString("some |rred|n text")) would be "some |rred|n text". I figured that it makes the most sense to flatten back to the original string than to lose information.

@InspectorCaracal
Copy link
Contributor Author

@ChrisLR Oh! You're misunderstanding the use case xD this is for things like printing to logs and other situations that just want the actual flat string

If you take a look at the PR's changes to the telnet and webclient protocols and at the evstring module, you'll see that it already implements sensible protocol-format methods that are called when about to be sent down the wire, which themselves use separate renderer classes built from the previous ANSIParser and Text2Html classes

The intent is that the EvString and its containers, such as tables, will be kept as that object while being passed around internally.

@ChrisLR
Copy link
Contributor

ChrisLR commented Sep 11, 2023

@InspectorCaracal my bad then I must've missed something somewhere 😅

@Griatch
Copy link
Member

Griatch commented Oct 29, 2023

@InspectorCaracal This one will also need a rebase as well, due to the changes of the upstream repo to reduce its size.

@Griatch
Copy link
Member

Griatch commented Nov 18, 2023

@InspectorCaracal Is this still in progress?

@InspectorCaracal
Copy link
Contributor Author

@Griatch Yep! I'm just on a particularly tricky part and I've been a lot busier since the school year started.

@InspectorCaracal
Copy link
Contributor Author

Looks like I messed up a few of the conflicts when rebasing 😩 I'll make sure to clean that up before this leaves draft stage

@InspectorCaracal
Copy link
Contributor Author

InspectorCaracal commented Dec 7, 2023

previous comment removed

What I thought was the issue was not the issue! The issue is, in fact, that all of the msg methods convert all outgoing objects to string before passing them on to the portal. I'll be modifying that next.

@fariparedes
Copy link

fariparedes commented Jan 31, 2024

Currently on this branch, any message with MXP markup in it will silently fail to send to the user.

Fails with this error:

|Portal| 2024-01-30 22:29:48 [!!]   File "evennia\evennia\utils\ansi.py", line 462, in convert_markup
|Portal| 2024-01-30 22:29:48 [!!]     output.append(self.convert_mxp(text, link_type=link.key, link_value=link.link))
|Portal| 2024-01-30 22:29:48 [!!]                   ^^^^^^^^^^^^^^^^
|Portal| 2024-01-30 22:29:48 [!!] AttributeError: 'RenderToANSI' object has no attribute 'convert_mxp'. Did you mean: 'convert_markup'?

The error is caused by a call to convert_mxp(), which does not exist anymore. Replacing the function call with evennia.server.portal.mxp.mxp_parse(text) fixes the issue. However, MXP will still not display properly. To fix that, you need to replace the code block like so:

elif isinstance(chunk, EvLink):
	if mxp:
		output.append(mxp_parse(chunk.raw()))
	else:
		link = chunk.data()
		text = _EVSTRING(link.text, ansi=self).to_ansi()
		output.append(text)

This preserves the MXP data for the portal to parse. Otherwise, it only passes the link text.

@fariparedes
Copy link

At the end of EvString._split_codes, the raw string is improperly set if the EvString contains MXP. The proper code would look like this:
self._raw_string = ''.join([str(c) if type(c) != EvLink else c.raw() for c in final_chunks])

Otherwise, MXP markup is discarded.

"""
if len(self._code_chunks) == 1:
if not len(self._code_chunks[0]):
return EvString(self._raw_string, chunks=self.code_chunks)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I presume that this is supposed to be self._code_chunks, not self.code_chunks (ditto for lstrip and strip).

left_chunks = []
stripped = None
# iterate through from the start until we find a chunk that's not an EvCode
for i, item in enumerate(self._code_chunks):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something about this loop or enumerate in general appears to convert EvLinks to plain strings. As EvLink does not retain its MXP data when converted to a string, this strips the MXP tags from the EvString.

Also, EvLink doesn't appear to have its own strip functions? Maybe related?

e: Yep, adding the following functions to EvLink fixes it being stripped of MXP here:

def strip(self, chars=None):
	"""Returns a new EvLink with the link text stripped."""
	result = self._ev_string.strip(chars)
	return EvLink(result, link_value=self._link_string, link_key=self._link_key)

def lstrip(self, chars=None):
	"""Returns a new EvLink with the link text stripped on the left."""
	result = self._ev_string.lstrip(chars)
	return EvLink(result, link_value=self._link_string, link_key=self._link_key)

def rstrip(self, chars=None):
	"""Returns a new EvLink with the link text stripped on the right."""
	result = self._ev_string.rstrip(chars)
	return EvLink(result, link_value=self._link_string, link_key=self._link_key)

"""
text = args[0]
if not isinstance(text, str):
text = to_str(text)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to_str() strips MXP from EvLinks, unlike EvCode, because it converts them to strings, so MXP will not survive being added to an EvString, e.g. by joining chunks. Perhaps EvLink should have a __str__ function which returns self.raw() to avoid this?

e: Actually, since EvLink descends from str, this will never be called for it. But the MXP is still stripped whenever you call EvString(EvLink). You could add another if statement:

if isinstance(text, EvLink):
	text = text.raw()

Not sure if this will be enough to stop MXP codes from ever being mistakenly lost, though. Maybe EvLink should descend from EvString instead of just str, since it is essentially a wrapper around an EvString?

return (self, '', '')

result = self._ev_string.partition(sep)
return ( EvLink(result[0], link_value=self._link_string, link_key=self._link_key), result[1], result[2] )
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you would want to wrap EvLink() around all of these in order to retain the unbroken link functionality, possibly even the separator. Otherwise, the link sticks only to the first part, which would not be the expected behavior e.g. if it were wrapped.

e: Splitting EvLinks is actually pretty tricky. You'd need to add something to EvString's join() to recognize equivalent EvLinks and merge them together into one chunk. In short:

>>> EvString(' ').join([EvString('|g|lcsay hi|ltsay|le'), EvString('|lcsay hi|lthi|le|w')])
EvString('|g|lcsay hi|ltsay hi|le|w')

And even this :

>>> EvString(' ').join([EvString('|g|lcsay hi|ltsay|le'), EvString('|lcsay hi|lt|bhi|le|w')])
EvString('|g|lcsay hi|ltsay |bhi|le|w')
# Optional challenge mode
>>> EvString(' ').join([EvString('|g|lcsay hi|ltsay|le'), EvString('|b|lcsay hi|lthi|le|w')])
EvString('|g|lcsay hi|ltsay |bhi|le|w')
>>> EvString(' ').join([EvString('|g|lcsay hi|ltsay|le|b'), EvString('|lcsay hi|lthi|le|w')])
EvString('|g|lcsay hi|ltsay|b hi|le|w')

Perhaps something like this (but better):

for item in iterable:
	if not isinstance(item, EvString):
		item = EvString(item)
	if last_item is not None:
		if len(item._code_chunks) == 1 and len(last_item._code_chunks) == 1:
			first_chunk = last_item._code_chunks[0]
			second_chunk = item._code_chunks[0]
			# maybe a helper method which checks if first_chunk's MXP is equivalent to second_chunk's?
			if isinstance(first_chunk, EvLink) and isinstance(second_chunk, EvLink) and first_chunk._link_string == second_chunk._link_string and first_chunk._link_key == second_chunk._link_key:
				# must convert the raw string into an EvLink in order to bridge the chunks
				result += EvString(EvLink(self._raw_string, link_string = first_chunk._link_string, link_key = second_chunk._link_key))
				result += item
				last_item = item
				continue
		result += self._raw_string
	result += item
	last_item = item

And then in _adder():

# if both EvStrings are joining on equivalent EvLinks, we combine the connecting chunks into one
elif EvLink == type(first._code_chunks[-1]) == type(second._code_chunks[0]) and first._code_chunks[-1]._link_string == second._code_chunks[0]._link_string and first._code_chunks[-1]._link_key == second._code_chunks[0]._link_key:
	glue = EvLink(first._code_chunks[-1]._ev_string + second._code_chunks[0]._ev_string, link_string = first._code_chunks[-1]._link_string, link_key = first._code_chunks[-1]._link_key)
	code_chunks = first._code_chunks[:-1] + (glue,) + second._code_chunks[1:]

first = self._code_chunks[:i] + (result[0],)
last = (result[2],) + self._code_chunks[i+1:]
# create new EvStrings from our partitioned results and return
return ( EvString(''.join(first), chunks=first), sep, EvString(''.join(last), chunks=last), )
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Despite your own advice in EvString.join()'s doc string, you are joining EvStrings with str.join() here, and in several other places in the file. ;) Maybe the chunks property is intended to preserve the metadata? But I think it may be breaking MXP.

@InspectorCaracal
Copy link
Contributor Author

I've been arguing with myself for a while on this, because this conversion and task is something I feel strongly about getting completed, but I think it's time for me personally to throw in the towel on this PR.

I think that the fundamentals and direction I've taken here are sound, but I don't have the technical knowledge or time necessary to work through the remaining issues, particularly those regarding the way ANSIString is integrated into the more complex formatting classes like EvForm and EvTable.

In other words: I'm forced to admit that this is as far as I can take it.

I'll leave my branch intact in case someone else can and wants to pick up and finish what I've started. The PR itself can be left as a draft or closed.

@Griatch
Copy link
Member

Griatch commented Mar 30, 2024

@InspectorCaracal Ok, no worries! Thanks for your work so far! As you said, let's keep this branch open for now so your work can act as a basis for future development.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants