Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport __ from v- #557

Open
ormaaj opened this issue Oct 13, 2022 · 5 comments
Open

Backport __ from v- #557

ormaaj opened this issue Oct 13, 2022 · 5 comments
Labels
enhancement New feature or request

Comments

@ormaaj
Copy link

ormaaj commented Oct 13, 2022

__ has many uses for accessing things that you can't get to any other way. I can't name them all because I almost certainly haven't discovered all that it does. In some contexts it behaves as a self-reference as you would expect _ to be used for, while in others it has a "parent-of-self" function.

One interesting use is to access the current element during a replacement expansion:

(cmd)gentoo@ormaaj-laptop (45134) /home/ormaaj $ ( echo; for sh in ./bin/ksh{93,2.1}; do "$sh" /dev/fd/9 9<<\EOF9; echo; done )
typeset -a a=(zero one)
a[3]=three a[4]=four
printf '%s ' "${a[@]/~(E)^/${!__}=}"; echo
print -rv -- .sh.version
EOF9

a[0]=zero a[1]=one a[3]=three a[4]=four
Version ABIJM 93v- 2014-12-24

__=zero __=one __=three __=four
Version AJM 93u+m/1.1.0-alpha+e920f42e/MOD 2022-09-27

Examples of other places where __ does "something":

  • .sh.math function definitions
  • compound variable assignments
  • Definitions and usages of typeset -T objects
  • in some cases, built-in types in the .sh namespace

To name a few. It also has the usual assortment of bewildering bugs that obfuscate its intended purpose.

The origin of __ was a thread in which I pointed out that _ within a get/set discipline should reference the current object analogous to the way "properties" or "accessors" work in other languages. https://marc.info/?l=ast-users&m=137448362609866&w=4

Evidently this usage conflicted with other people's ideas about what it should do, so __ was added to refer to the parent object in contexts where hierarchical structures apply. It seems to have been extended for other uses, none of which are documented, of course.

EDIT: Later in that thread I commented that __ should work like python's super() or something. No clue what I was thinking there. That's nonsense unless ksh adopts a multiple inheritance scheme, which I wouldn't propose. :/

@McDutchie McDutchie added the enhancement New feature or request label Oct 13, 2022
@McDutchie
Copy link

McDutchie commented Apr 2, 2023

Here is a first attempt to backport it from ksh 93v- 2013-08-07. The changes all involve sh.oldnp and np->nvenv.

First backport attempt
diff --git a/src/cmd/ksh93/include/name.h b/src/cmd/ksh93/include/name.h
index 74684894f..91a89858b 100644
--- a/src/cmd/ksh93/include/name.h
+++ b/src/cmd/ksh93/include/name.h
@@ -84,6 +84,7 @@ struct Namref
 {
 	Namval_t	*np;
 	Namval_t	*table;
+	Namval_t	*oldnp;
 	Dt_t		*root;
 	char		*sub;
 #if SHOPT_FIXEDARRAY
@@ -153,6 +154,7 @@ struct Ufunction
 #define nv_reftree(n)	((n)->nvalue.nrp->root)
 #define nv_reftable(n)	((n)->nvalue.nrp->table)
 #define nv_refsub(n)	((n)->nvalue.nrp->sub)
+#define nv_refoldnp(n)	((n)->nvalue.nrp->oldnp)
 #if SHOPT_FIXEDARRAY
 #   define nv_refindex(n)	((n)->nvalue.nrp->curi)
 #   define nv_refdimen(n)	((n)->nvalue.nrp->dim)
@@ -206,6 +208,7 @@ extern void		nv_outnode(Namval_t*,Sfio_t*, int, int);
 extern int		nv_subsaved(Namval_t*, int);
 extern void		nv_typename(Namval_t*, Sfio_t*);
 extern void		nv_newtype(Namval_t*);
+extern Namval_t		*nv_typeparent(Namval_t*);
 extern int		nv_istable(Namval_t*);
 extern size_t		nv_datasize(Namval_t*, size_t*);
 extern Namfun_t		*nv_mapchar(Namval_t*, const char*);
diff --git a/src/cmd/ksh93/include/shell.h b/src/cmd/ksh93/include/shell.h
index 7efad5109..b5b49a36a 100644
--- a/src/cmd/ksh93/include/shell.h
+++ b/src/cmd/ksh93/include/shell.h
@@ -285,6 +285,7 @@ struct Shell_s
 	Namval_t	*namespace;	/* current active namespace */
 	Namval_t	*last_table;	/* last table used in last nv_open */
 	Namval_t	*prev_table;	/* previous table used in nv_open */
+	Namval_t	*oldnp;		/* last valid parent node */
 	Sfio_t		*outpool;	/* output stream pool */
 	long		timeout;	/* read timeout */
 	unsigned int	curenv;		/* current subshell number */
diff --git a/src/cmd/ksh93/sh/name.c b/src/cmd/ksh93/sh/name.c
index 580205a88..80b6d1c46 100644
--- a/src/cmd/ksh93/sh/name.c
+++ b/src/cmd/ksh93/sh/name.c
@@ -704,6 +704,18 @@ static char *stack_extend(const char *cname, char *cp, int n)
 	return (char*)name;
 }
 
+static Namval_t *nv_parentnode(Namval_t *np)
+{
+	Namval_t *mp = np;
+	if(nv_istable(np))
+		return nv_parent(np);
+	if(mp=nv_typeparent(np))
+		return mp;
+	if((mp= (Namval_t*)np->nvenv))
+		return mp;
+	return np;
+}
+
 Namval_t *nv_create(const char *name,  Dt_t *root, int flags, Namfun_t *dp)
 {
 	char			*sub=0, *cp=(char*)name, *sp, *xp;
@@ -872,6 +884,9 @@ Namval_t *nv_create(const char *name,  Dt_t *root, int flags, Namfun_t *dp)
 			if(c)
 				*sp = c;
 			top = 0;
+			if(np && !nv_isattr(np,NV_MINIMAL) && sh.oldnp && !np->nvenv && sh.oldnp!=np)
+				np->nvenv = (char*)sh.oldnp;
+			sh.oldnp = np;
 			if(isref)
 			{
 #if SHOPT_FIXEDARRAY
@@ -898,6 +913,8 @@ Namval_t *nv_create(const char *name,  Dt_t *root, int flags, Namfun_t *dp)
 					n = nv_refindex(np);
 					dim = nv_refdimen(np);
 #endif /* SHOPT_FIXEDARRAY */
+					if(sh.oldnp = nv_refoldnp(np))
+						sh.oldnp = (Namval_t*)sh.oldnp->nvenv;
 					np = nv_refnode(np);
 #if SHOPT_FIXEDARRAY
 					if(n)
@@ -1208,7 +1225,25 @@ Namval_t *nv_create(const char *name,  Dt_t *root, int flags, Namfun_t *dp)
 				return np;
 			cp++;
 			break;
+		    case '_':
+			if(cp[1]=='_' && (cp[2]==0 || cp[2]=='.') && sh.oldnp)
+			{
+				cp += 2;
+				dp->last = cp;
+				nvcache.ok = 0;
+				sh.oldnp = np = nv_parentnode(sh.oldnp);
+				if(*cp==0)
+					return np;
+				sp = nv_name(np);
+				c = strlen(sp);
+				dp->nofree |= 1;
+				name = copystack(sp,cp,NULL);
+				cp = (char*)name+c+1;
+				break;
+			}
+			/* FALLTHROUGH */
 		    default:
+			sh.oldnp = np;
 			dp->last = cp;
 			if((c = mbchar(cp)) && !isaletter(c))
 				return np;
@@ -3384,6 +3419,7 @@ void nv_setref(Namval_t *np, Dt_t *hp, int flags)
 	np->nvalue.nrp = sh_newof(0,struct Namref,1,sizeof(Dtlink_t));
 	np->nvalue.nrp->np = nq;
 	np->nvalue.nrp->root = hp;
+	np->nvalue.nrp->oldnp = sh.oldnp;
 	if(ep)
 	{
 #if SHOPT_FIXEDARRAY
@@ -3531,14 +3567,17 @@ char *nv_name(Namval_t *np)
 	if(!nv_isattr(np,NV_MINIMAL|NV_EXPORT) && np->nvenv)
 	{
 		Namval_t *nq= sh.last_table, *mp= (Namval_t*)np->nvenv;
-		if(np==sh.last_table)
-			sh.last_table = 0;
-		if(nv_isarray(mp))
-			sfprintf(sh.strbuf,"%s[%s]",nv_name(mp),np->nvname);
-		else
-			sfprintf(sh.strbuf,"%s.%s",nv_name(mp),np->nvname);
-		sh.last_table = nq;
-		return sfstruse(sh.strbuf);
+		if(!strchr(np->nvname,'.') || nv_type(mp) || (nv_isarray(mp) && (cp=nv_getsub(mp)) && strcmp(np->nvname,cp)==0))
+		{
+			if(np==sh.last_table)
+				sh.last_table = 0;
+			if(nv_isarray(mp))
+				sfprintf(sh.strbuf,"%s[%s]",nv_name(mp),np->nvname);
+			else
+				sfprintf(sh.strbuf,"%s.%s",nv_name(mp),np->nvname);
+			sh.last_table = nq;
+			return sfstruse(sh.strbuf);
+		}
 	}
 	if(nv_istable(np))
 		sh.last_table = nv_parent(np);
diff --git a/src/cmd/ksh93/sh/nvtree.c b/src/cmd/ksh93/sh/nvtree.c
index 3aca14e51..b42e12cda 100644
--- a/src/cmd/ksh93/sh/nvtree.c
+++ b/src/cmd/ksh93/sh/nvtree.c
@@ -704,6 +704,8 @@ static void outval(char *name, const char *vname, struct Walk *wp)
 			if(fp = nv_stack(np,NULL))
 				free(fp);
 			np->nvfun = 0;
+			if(!nv_isattr(np,NV_MINIMAL))
+				np->nvenv = 0;
 			return;
 		}
 		for(xp=fp->next; xp; xp = xp->next)
@@ -1004,7 +1006,9 @@ static char *walk_tree(Namval_t *np, Namval_t *xp, int flags)
 			sh.var_tree = dp;
 			if(nq && mq)
 			{
+				char *nvenv = mq->nvenv;
 				nv_clone(nq,mq,flags|NV_RAW);
+				mq->nvenv = nvenv;
 				if(flags&NV_MOVE)
 					nv_delete(nq,walk.root,0);
 			}
diff --git a/src/cmd/ksh93/sh/nvtype.c b/src/cmd/ksh93/sh/nvtype.c
index 56db5998a..21e1db54d 100644
--- a/src/cmd/ksh93/sh/nvtype.c
+++ b/src/cmd/ksh93/sh/nvtype.c
@@ -447,9 +447,12 @@ static Namfun_t *clone_type(Namval_t* np, Namval_t *mp, int flags, Namfun_t *fp)
 					nv_addnode(nr,1);
 				if(pp->strsize<0)
 					continue;
-				_nv_unset(nr,0);
-				if(!nv_isattr(nr,NV_MINIMAL))
-					nv_delete(nr,sh.last_root,0);
+				if(nr != (Namval_t*)np->nvenv)
+				{
+					_nv_unset(nr,0);
+					if(!nv_isattr(nr,NV_MINIMAL))
+						nv_delete(nr,sh.last_root,0);
+				}
 			}
 			else if(nv_isattr(nq,NV_RDONLY) && !nq->nvalue.cp && !nv_isattr(nq,NV_INTEGER))
 			{
@@ -1509,3 +1512,14 @@ int	sh_outtype(Sfio_t *out)
 	dtinsert(sh.var_base,L_ARGNOD);
 	return 0;
 }
+
+Namval_t *nv_typeparent(Namval_t *np)
+{
+	Namchld_t *fp;
+	Namtype_t *tp;
+	if(fp = (Namchld_t*)nv_hasdisc(np,&chtype_disc))
+		return fp->ptype->parent;
+	if(tp = (Namtype_t*)nv_hasdisc(np,&type_disc))
+		return tp->parent;
+	return 0;
+}
diff --git a/src/cmd/ksh93/sh/subshell.c b/src/cmd/ksh93/sh/subshell.c
index 25e92cdee..3e216bf08 100644
--- a/src/cmd/ksh93/sh/subshell.c
+++ b/src/cmd/ksh93/sh/subshell.c
@@ -349,7 +349,7 @@ static void nv_restore(struct subshell *sp)
 		}
 		nv_setsize(mp,nv_size(np));
 		if(!(flags&NV_MINIMAL))
-			mp->nvenv = np->nvenv;
+			mp->nvenv = nv_isnull(mp) ? 0 : np->nvenv;
 		mp->nvfun = np->nvfun;
 		if(np->nvfun && nofree)
 			np->nvfun->nofree = nofree;

The patch makes your example script work:

typeset -a a=(zero one)
a[3]=three a[4]=four
printf '%s ' "${a[@]/~(E)^/${!__}=}"; echo

Output before patch: __=zero __=one __=three __=four . Output after patch: a[0]=zero a[1]=one a[3]=three a[4]=four .

However, the patch also causes a few regression test failures:

test pointtype begins at 2023-04-02+13:49:30
	pointtype.sh[109]: FAIL: expansion of associative array of types is incorrect
	pointtype.sh[111]: FAIL: typeset -p z for associative of types is incorrect
	pointtype.sh[132]: FAIL: expansion of type containing indexed array of types is incorrect
test pointtype failed at 2023-04-02+13:49:30 with exit code 3 [ 36 tests 3 errors ]
test types begins at 2023-04-02+13:50:29
/usr/local/src/ksh93/ksh/src/cmd/ksh93/tests/types.sh[33]: Type_t: r: is read only
test types failed at 2023-04-02+13:50:29 with exit code 1 [ 100 tests 1 error ]

Of course I may well have missed something while backporting this, so please do verify the patch against the 93v- commit linked above.

There should also be a fix for it in ksh 93v- 2013-08-29 ("Fixed __ to work in nested types") but I haven't managed to isolated that from all the other changes in that commit yet. I suspect the fix involves an (ab)use of the NV_EXPORT attribute. It's so unfortunate they never kept changes in distinct commits :/

@McDutchie
Copy link

Having said all that, I'm not convinced this is worth integrating into ksh 93u+m. I don't think an interesting hack is worth it when it comes with "the usual assortment of bewildering bugs that obfuscate its intended purpose"; we prioritise fixing bugs here, so I'd like to avoid introducing any where possible.

In the example: printf '%s ' "${a[@]/~(E)^/${!__}=}" (which, as of ceae1e4 on the dev branch, can also be written as printf '%s ' "${a[@]/#/${!__}=}"), the ${!__} expansion does not have a constant value; it re-expands differently for each iteration of the vector expansion. This violates a basic principle of shell grammar, so I'm not a fan of that at all. There is a much cleaner way to achieve the same thing:

typeset -a a=(zero one)
a[3]=three a[4]=four
for i in "${!a[@]}"
do	printf 'a[%d]=%s ' i "${a[i]}"
done; echo

…and that works on every ksh93, and makes for much more legible and understandable code.

As for the use in types, in the mailing list thread, you mentioned a workaround that you called "nasty": nameref this=${.sh.name%.*}; .sh.value=${this.y};. Two questions:

  1. Why is that nasty?
  2. Is it nastier than the 93v- __ hack?

I could be convinced to backport __, and try to fix it, if you can present an actual, concrete use case that is not feasible to implement in ksh as it is.

@McDutchie
Copy link

the ${!__} expansion does not have a constant value; it re-expands differently for each iteration of the vector expansion. This violates a basic principle of shell grammar, so I'm not a fan of that at all.

…on the other hand, that turns out to be a pre-existing quirk of ksh, and not something introduced by this patch.

$ ksh -c 'echo ${@/~(E)^/$((++i))}' _ one two three four
1one 2two 3three 4four

whereas:

$ bash -c 'echo ${@/#/$((++i))}' _ one two three four
1one 1two 1three 1four
$ zsh -c 'echo ${@/#/$((++i))}' _ one two three four 
1one 1two 1three 1four

I would consider the ksh93 behaviour to be clearly a bug, but there are probably scripts that depend on it. :-/

@ormaaj
Copy link
Author

ormaaj commented Apr 13, 2023

the ${!__} expansion does not have a constant value; it re-expands differently for each iteration of the vector expansion. This violates a basic principle of shell grammar, so I'm not a fan of that at all.

…on the other hand, that turns out to be a pre-existing quirk of ksh, and not something introduced by this patch.

$ ksh -c 'echo ${@/~(E)^/$((++i))}' _ one two three four
1one 2two 3three 4four

whereas:

$ bash -c 'echo ${@/#/$((++i))}' _ one two three four
1one 1two 1three 1four
$ zsh -c 'echo ${@/#/$((++i))}' _ one two three four 
1one 1two 1three 1four

I would consider the ksh93 behaviour to be clearly a bug, but there are probably scripts that depend on it. :-/

It's definitely a feature. The other shells just chose not to implement it for whatever reason. It's required in order to access sub-matches from .sh.match that aren't accessible through ordinary group references, among other uses. There was at one time a bug in which the parameter expansions quit working and a fix was required to restore that behavior.

@ormaaj
Copy link
Author

ormaaj commented Apr 14, 2023

…and that works on every ksh93, and makes for much more legible and understandable code.

As for the use in types, in the mailing list thread, you mentioned a workaround that you called "nasty": nameref this=${.sh.name%.*}; .sh.value=${this.y};. Two questions:

1. Why is that nasty?

2. Is it nastier than the 93v- `__` hack?

I could be convinced to backport __, and try to fix it, if you can present an actual, concrete use case that is not feasible to implement in ksh as it is.

I did not (and still don't) understand Irek and other's objections in that thread. I created that thread due to the misbehavior of _ and I'm pretty sure that must have been some oversight because I can't think of any reason for _ to not behave as I proposed (and DGK agreed and fixed it).

From that thread...

This can't work. .sh.value is identical to _

Is that even true. If so why? Why _ would need to be a synonym for .sh.value.

Think about ksh93 types as compound variables with predefined
discipline functions. In your example you're trying to assign the
value of ++_.y to the value of a compound variable - which obviously
cannot work.

I have no clue what was meant here. A type is nothing like a compound variable other than superficially borrowing bits of the compound assignment grammar for their definitions. Predefined discipline functions? What? I didn't even use a compound variable in that example and _.y definitely doesn't refer to one there.

What about . to access the parent in both types and compound variable trees?

What on earth is a type tree? Why would a type have a parent? Compound variables sure.

I think the _ bug that was the topic there and the introduction of __ are totally separate issues. __ is only really interesting for its new functionality. It does make sense to have a way to access the current variable with _ in the case of compounds and the parent through __. With regard to types I don't think it was ever discussed what role __ would have either as a member variable or in the context of a type's functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants