Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ssh/tailssh: fall back to using su when no TTY available on Linux #11910

Merged
merged 1 commit into from
May 29, 2024

Conversation

oxtoacart
Copy link
Contributor

@oxtoacart oxtoacart commented Apr 29, 2024

This allows pam authentication to run, triggering automation
like pam_mkhomedir.

Updates #11854

@@ -3,6 +3,10 @@ FROM ${BASE}

RUN groupadd -g 10000 groupone
RUN groupadd -g 10001 grouptwo
RUN useradd -g 10000 -G 10001 -u 10002 -m testuser
RUN useradd -g 10000 -G 10001 -u 10002 testuser
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing the -m flag keeps this from creating a home directory for the user (we want PAM to do that on login).

@@ -126,22 +127,7 @@ func (ss *sshSession) newIncubatorCommand() (cmd *exec.Cmd) {
if isShell {
incubatorArgs = append(incubatorArgs, "--shell")
}
// Only the macOS version of the login command supports executing a
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need to do this outside of the actual incubator process, so this has moved into tryLoginCmd.

@oxtoacart oxtoacart force-pushed the ox/11854-3 branch 2 times, most recently from 2655cc4 to a1ad9e5 Compare May 1, 2024 12:18
Base automatically changed from ox/11854-2 to main May 1, 2024 16:19
@oxtoacart oxtoacart requested review from maisem and bradfitz May 1, 2024 16:20
@oxtoacart oxtoacart changed the title ssh/tailssh: fall back to su when login is unavailable ssh/tailssh: fall back to using su when no TTY available on Linux May 1, 2024
@oxtoacart oxtoacart force-pushed the ox/11854-3 branch 7 times, most recently from f9f1be8 to 7c3424b Compare May 2, 2024 19:28
Copy link
Contributor

@awly awly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This links to https://github.com/tailscale/corp/issues/11854, which is probably the wrong issue.
It would be great to see a description of the problem this PR solves for context (sorry if I missed it somewhere!)

ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
@oxtoacart
Copy link
Contributor Author

This links to tailscale/corp#11854, which is probably the wrong issue. It would be great to see a description of the problem this PR solves for context (sorry if I missed it somewhere!)

Sorry, wrong repo. See #11854.

ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
Comment on lines 393 to 460
// First try to execute su -l <user> -c id to make sure su supports the
// necessary arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there versions of su that don't support these arguments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know about Linux, but my version of MacOS doesn't support -c.

SU(1)                                                                                                         General Commands Manual                                                                                                         SU(1)

NAME
     su – substitute user identity

SYNOPSIS
     su [-] [-flm] [login [args]]

ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
Comment on lines 407 to 408
loginArgs = append(loginArgs, "-c", ia.cmdName)
loginArgs = append(loginArgs, ia.cmdArgs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to pass the whole command+args to the -c flag:

Suggested change
loginArgs = append(loginArgs, "-c", ia.cmdName)
loginArgs = append(loginArgs, ia.cmdArgs...)
loginArgs = append(loginArgs, "-c", strings.Join(append([]string{ia.cmdName}, ia.cmdArgs...), " "))

from local testing:

❱ su -l awly -c echo test
Password:

❱ su -l awly -c "echo test"
Password:
test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestingly enough, I can't get this to fail in my integration test because SSH already passes the entire command and arguments as a single command string.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah, on MacOS at least, /usr/bin/login suffers from the same issue.

➜  oss git:(ox/11854-3) ✗ sudo /usr/bin/login -f -pq -h 100.100.100.102 testuser /bin/zsh -c ls -l /tmp/sftptest.dat
Desktop		Documents	Downloads	Library		Movies		Music		Pictures	Public		sftptest.dat
➜  oss git:(ox/11854-3) ✗ sudo /usr/bin/login -f -pq -h 100.100.100.102 testuser /bin/zsh -c "ls -l /tmp/sftptest.dat"
-rw-r--r--@ 1 testuser  wheel  11 May  4 11:38 /tmp/sftptest.dat

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So yeah, long story short, we're okay here because the command and its args are already passed as a single string from the invocation of beincubator. I'll add a unit test that asserts this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've added a test that runs "echo test". I've also refactored the test to make it clearer what commands we're calling where, and whether they have arguments or not.

ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
@oxtoacart oxtoacart force-pushed the ox/11854-3 branch 5 times, most recently from 8748e47 to b2b1cf8 Compare May 4, 2024 19:09
// findSU attempts to find an su command which supports the -l and -c flags.
// This actually calls the su command, which can cause side effects like
// triggering pam_mkhomedir.
func findSU(logf logger.Logf, ia incubatorArgs) (string, bool) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional: the only return values here are "", false and su, true.
maybe this can just return a single string value, and the called checks if it's empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, changed.

ssh/tailssh/incubator.go Outdated Show resolved Hide resolved
Comment on lines 499 to 541
func dropPrivileges(logf logger.Logf, ia incubatorArgs) error {
return doDropPrivileges(logf, ia.uid, ia.gid, ia.gids)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this wrapper seems redundant now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept it to avoid complicating the test code.

isShell = true
args = append(args, "-l") // login shell
}
isShell = ss.RawCommand() == ""
default:
panic(fmt.Sprintf("unexpected subsystem: %v", ss.Subsystem()))
}

if ss.conn.srv.tailscaledPath == "" {
// TODO(maisem): this doesn't work with sftp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this comment still accurate? It's confusing in any case. Can you delete or clarify what it means?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. We really should be returning an error here. I've updated the code and clarified the comment.

if !isSFTP {
loginShell := ss.conn.localUser.LoginShell()
args := shellArgs(isShell, ss.RawCommand())
logf("directly running %s %q", loginShell, args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where's this log to? Is this still needed or was it leftover debug?

Copy link
Contributor Author

@oxtoacart oxtoacart May 28, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless debugIncubator is compiled to true or we're running in tests, this logs to Discard.

"--local-user=" + lu.Username,
"--remote-user=" + remoteUser,
"--remote-ip=" + ci.src.Addr().String(),
"--has-tty=false", // updated in-place by startWithPTY
"--tty-name=", // updated in-place by startWithPTY
}

forceV1Behavior := ss.conn.srv.lb.NetMap().SelfNode.HasCap(tailcfg.NodeAttrSSHBehaviorV1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might want a method on LocalBackend instead so you can easily guard against the small window where NetMap can turn nil on a logout/switch concurrent with an incoming SSH connection and then crash here on the SelfNode deref.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, nice! Done. The HasCap already guards agains a null SelfNode, but NetMap() can be nil, so your suggestion fixes that.

return unix.Exec(ia.loginCmdPath, la, os.Environ())
}
if !shouldAttemptLoginShell(logf, ia) {
logf("not attempting login shell")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same question

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logs nowhere in production builds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might name it vlogf then. That's what we do elsewhere for such things.

func tryLoginWithSU(logf logger.Logf, ia incubatorArgs) (bool, error) {
// Currently, we only support falling back to su on Linux. This
// potentially could work on BSDs as well, but requires testing.
if runtime.GOOS != "linux" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still like an explicit check for Gokrazy just for clarity that it's not going to need/try to find 'su'.

if err != nil {
t.Fatalf("can't open file: %s", err)
func fallbackToSUAvailable() bool {
if runtime.GOOS != "linux" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gokrazy here too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is testing code. Are you saying you want the integration test to be able to run on gocrazy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, okay. I guess I don't care about testing on gokrazy.

Comment on lines 417 to 423
sessionCloser, err := maybeStartLoginSession(logf, ia)
if err == nil && sessionCloser != nil {
defer sessionCloser()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ignoring err != nil case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's old code, but yeah, it seems to ignore the error case. I've removed the error return, since it's actually always nil anyway.

@oxtoacart
Copy link
Contributor Author

#11910 (comment)

That's dead code.

Copy link
Member

@bradfitz bradfitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approved but see comments first.

Comment on lines 4654 to 4660
// ForceV1SSHBehavior indicates whether the original V1 SSH behavior should be used.
func (b *LocalBackend) ForceV1SSHBehavior() bool {
b.mu.Lock()
defer b.mu.Unlock()

return b.netMap != nil && b.netMap.SelfNode.HasCap(tailcfg.NodeAttrSSHBehaviorV1)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually, turns out you don't need this. There's a method that already guards against a nil NetMap:

// HasCap reports whether nm is non-nil and nm.AllCaps contains c.
func (nm *NetworkMap) HasCap(c tailcfg.NodeCapability) bool {
	return nm != nil && nm.AllCaps.Contains(c)
}

So you can use that from the ssh code instead, keeping that code localized.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I like that better.

@@ -66,6 +66,7 @@ const (
// It is used for testing.
type ipnLocalBackend interface {
GetSSH_HostKeys() ([]gossh.Signer, error)
ForceV1SSHBehavior() bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... then you don't need to add this

@@ -2274,6 +2274,9 @@ const (
// depending on the destination address and the configured routes. When present, it also makes
// the DNS forwarder use UserDial instead of SystemDial when dialing resolvers.
NodeAttrUserDialUseRoutes NodeCapability = "user-dial-routes"

// NodeAttrSSHBehaviorV1 forces SSH to use the V1 behavior (no su, run SFTP in-process)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention a date (yyyy-mm-dd) and Tailscale version number in here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do, and will also update CurrentCapabilityVersion.

@@ -77,6 +79,10 @@ func TestMain(m *testing.M) {
if err != nil {
return
}
defer func() {
// wait a little bit for logging to catch up.
time.Sleep(2 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sleeps in tests are never the right answer. They're either too short (flaky) or too long (extra pain for people running tests; they add up)

Also this comment doesn't justify it: catch up to what? What are we actually waiting for here?

Let's remove the sleep and add explicit synchronization if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's anything explicit to synchronize with, we're waiting for the tail command to read our logs. There's a delay between when the test terminates and tail actually reads the log file contents, hence the delay. I'll make the comment more explicit.

a.cmdArgs = flags.Args()
return a

for _, g := range strings.Split(groups, ",") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strings.Split("", ",") == []string{""} and will thus fail on L206. I guess this one at least (or all of them?) are expected to be non-empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not new code. AFAICT, on Unix systems, users will always have at least a primary group id, which means we should be fine here. We use these groups to masquerade as this user, so if there are no groups, it's not clear that we'd want to proceed anyway.

return unix.Exec(ia.loginCmdPath, la, os.Environ())
}
if !shouldAttemptLoginShell(logf, ia) {
logf("not attempting login shell")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might name it vlogf then. That's what we do elsewhere for such things.

// Unlike login, su often does not require a TTY, so on Linux hosts that have
// an su command which accepts the right flags, we'll use su instead of login
// when no TTY is available.
func trySU(logf logger.Logf, ia incubatorArgs) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably name these results, as this is an incommon signature. Maybe (hasSU bool, _ error)?

The caller does:

if handled, err := trySU(logf, ia); handled {
		return err

func tryLoginWithSU(logf logger.Logf, ia incubatorArgs) (bool, error) {
// Currently, we only support falling back to su on Linux. This
// potentially could work on BSDs as well, but requires testing.
if runtime.GOOS != "linux" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still prefer an explicit fast path for Gokrazy for clarity. No need to even try to find it.

Actually now that I think about it, it could be bad to even try to find it on gokrazy: when the 'breakglass' debug package is installed, it's possible there'd be a su binary (that might not work, just included as part of busybox) and we wouldn't want to behave differently in that case when debugging was in use.

@oxtoacart oxtoacart force-pushed the ox/11854-3 branch 2 times, most recently from 0385cfc to a9cd6a8 Compare May 29, 2024 17:56
This allows pam authentication to run for ssh sessions, triggering
automation like pam_mkhomedir.

Updates #11854

Signed-off-by: Percy Wegmann <[email protected]>
@oxtoacart
Copy link
Contributor Author

Thanks for the reviews @awly and @bradfitz !

@oxtoacart oxtoacart merged commit 08a9551 into main May 29, 2024
48 checks passed
@oxtoacart oxtoacart deleted the ox/11854-3 branch May 29, 2024 18:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants