Jeff Duntemann's Contrapositive Diary

November, 2022:

A libc Mystery…Solved

We have solved the mystery described in yesterday’s entry…

…mostly. I’ve found down the years that inside any big mystery are likely one or more smaller mysteries. And so it is. I would have figured this problem out a whole lot sooner if the symptoms had been consistent.

They weren’t. And those symptoms made me nuts for several days. Eventually I decided to yell for help.

I got a lot of very good help. If you haven’t read yesterday’s entry (and if you’re actually interested in assembly language programming) go read it now. I won’t repeat all the details here.

In short: I wrote a small demo program for my new book, x64 Assembly Language Step By Step. It didn’t work. Several of my readers took the code I posted in yesterday’s entry, built the executable, and…it worked.

That’s what made me nuts. I ran the damned thing on three different Linux instances, and the problem manifested on all three of them. But a couple of my friends ran the executable and had no trouble at all. It worked perfectly.

WTF?

That’s actually the small mystery inside the big mystery. The big mystery we figured out fairly quickly. Bruce, a new Contra commenter, built the executable and it failed. He changed one line in the program, and it worked. I tried his fix. It worked. Mystery solved.

But…why? Bruce cleared register RDI to null (i.e., 0) before calling the libc time function. I had cleared RAX, as part of an earlier test to try and pin down the symptoms. I intended to remove that line from the program. But it gave Bruce an idea: clear RDI instead. He did. It worked. I tried it, and…victory! Clearing RDI to 0 completely eliminated the problem, and I spent another hour trying various things to crash the executable. No luck. It was a consistent fix, in that once I cleared RDI to 0, nothing else would make the executable malfunction.

I think it started to dawn on several of us at once. Supposedly, the time function doesn’t take any parameters. Or so I supposed, based on my reading. But that was wrong. The Linux time function takes one (understated) parameter, which can be either 0 or an address. If it’s an address, time will put the current time_t timestamp value at that address, and return the same value in RAX. If it’s 0, time will just return the time_t value in RAX.

In stepping through the demo program’s execution in a debugger, I noticed that after a call to the puts function, register RDI would contain a memory address. It wasn’t always the same, and it wasn’t generally useful. So un-useful, in fact, that the garbage addresses being left in RDI would cause either a hang or a segmentation fault. In the x64 calling convention, the first parameter is always passed to a function in RDI, so time took whatever puts had left there as the address at which to store the timestamp. I didn’t think of time as having any parameters at all, but clearing RDI to 0 before calling time guaranteed that time would place the time_t value safely in register RAX…instead of crashing.
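For reference, here is a minimal sketch of the demo program with the fix applied. It assumes the same makefile and source layout as the original showtime.asm from yesterday’s entry, and it is a reconstruction rather than the listing that will appear in the book. The only functional change is that RDI is cleared instead of RAX before the call to time. Two small liberties are taken and should be read as assumptions: the unused timebuf variable is dropped, and main returns 0 explicitly.

```nasm
; showtime.asm with the one-line fix applied (a sketch, not the final listing)
; Build with the same commands as the original makefile:
;   nasm -f elf64 -g -F dwarf showtime.asm -l showtime.lst
;   gcc showtime.o -o showtime -no-pie

section .data
        timemsg db    "The timestamp is: ",0
        time1   dq    0             ; time_t stored here

section .text

extern  time
extern  ctime
extern  puts
global  main

main:
        push rbp                    ; Prolog
        mov rbp,rsp

        mov rdi,timemsg             ; 1st parameter: address of message
        call puts                   ; puts is free to leave anything in rdi

        xor rdi,rdi                 ; 1st parameter to time: 0 (null pointer)
        call time                   ; time returns time_t value in rax
        mov [time1],rax             ; Save time_t value to var time1

        mov rdi,time1               ; 1st parameter: pointer to time_t value
        call ctime                  ; Returns ptr to the date string in rax

        mov rdi,rax                 ; 1st parameter: pointer to date string
        call puts                   ; Print ctime's output string

        xor eax,eax                 ; Return 0 from main (added for this sketch)
        mov rsp,rbp                 ; Epilog
        pop rbp
        ret                         ; Return from main()
```

RDI is one of the registers a called function may overwrite, which is why its contents after puts returns cannot be relied on.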

So the big mystery was solved. I spent an hour and a half trying to get the program to crash. As long as RDI was 0 when time was called, it did not crash. Hallelujah! The big mystery was solved.

The small mystery remained: Why did some of my readers build the executable and have it work perfectly, while the exact same program on my Linux machines went belly-up? That remains an open service ticket. I’m mildly curious, but as long as I know that RDI has to be either 0 (preferably) or the address of a suitable buffer to hold the time_t value, all will be well.
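The other legal choice, passing time the address of a suitable buffer, can be sketched as follows. This is an illustration only, assuming the same build commands as the original makefile and an 8-byte time_t (which is what 64-bit Linux uses); the variable name time1 is borrowed from the demo program.

```nasm
; Sketch: let time store the timestamp itself by passing a valid address.
; Build: nasm -f elf64 showtime2.asm   (file name is hypothetical)
;        gcc showtime2.o -o showtime2 -no-pie

section .data
        time1   dq    0             ; 8-byte buffer for the time_t value

section .text

extern  time
extern  ctime
extern  puts
global  main

main:
        push rbp                    ; Prolog
        mov rbp,rsp

        mov rdi,time1               ; 1st parameter: address of the buffer
        call time                   ; Stores time_t at [time1]; also returns it in rax

        mov rdi,time1               ; Reload rdi: time was free to change it
        call ctime                  ; Returns ptr to the date string in rax

        mov rdi,rax                 ; 1st parameter: pointer to date string
        call puts                   ; Print the date string

        xor eax,eax                 ; Return 0 from main
        mov rsp,rbp                 ; Epilog
        pop rbp
        ret
```

With this form there is no separate store from RAX into time1, but the address in RDI has to point at writable memory big enough for a time_t. Zero remains the simpler option.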

Let me wrap up by abundantly thanking everyone who took part in the bug hunt:

  • My friend and SFF collaborator Jim Strickland
  • Linux expert Bill Buhler
  • New commenter Bruce
  • Long-time reader Jason Bucata
  • X64 programming expert Jonathan O’Neal
  • Contra regular Keith

You guys were brilliant. I will cite you all on the Acknowledgments page in the book, when it comes out (with some luck) next summer.

Again, thanks. In a weird but satisfying way, it was fun. Now I have to get back to work.

A libc Mystery

As most of you know by now, I’m hard at work on the x64 edition of my assembly book, to be called x64 Assembly Language Step By Step. I’m working on the chapter where I discuss calling functions in libc from assembly language. The 2009 edition of the book was pure 32-bit x86. Parameters were passed to libc functions mostly by pushing them on the stack, which required cleaning up the stack after each call, etc.

Calling conventions in x64 are radically different. The first six integer or pointer parameters to any function are passed in registers. (More than six and you have to start pushing them on the stack.) The first parameter goes in RDI, the second in RSI, the third in RDX, and so on through RCX, R8, and R9. When a function returns a single value, that value is passed back in RAX. This allows a lot more to be done without fooling with the stack.
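To make the register order concrete, here is a small sketch that passes four parameters to printf. It is not one of the book’s examples; the format string and the values are made up for illustration. It assumes Linux, NASM, and linking through gcc with -no-pie as elsewhere in this entry. Because printf takes a variable number of parameters, the convention also expects AL to hold the number of vector (XMM) registers used for floating-point parameters, which is zero here.

```nasm
; Sketch: integer and pointer parameters go in rdi, rsi, rdx, rcx, r8, r9.
; Build: nasm -f elf64 params.asm   (file name is hypothetical)
;        gcc params.o -o params -no-pie

section .data
        fmt     db    "%d + %d = %d",10,0   ; Format string, newline, null

section .text

extern  printf
global  main

main:
        push rbp                    ; Prolog
        mov rbp,rsp

        mov rdi,fmt                 ; 1st parameter: address of format string
        mov rsi,2                   ; 2nd parameter
        mov rdx,3                   ; 3rd parameter
        mov rcx,5                   ; 4th parameter (5th and 6th would use r8, r9)
        xor eax,eax                 ; al = 0: no vector registers used
        call printf                 ; Count of characters printed comes back in rax

        xor eax,eax                 ; Return 0 from main
        mov rsp,rbp                 ; Epilog
        pop rbp
        ret
```

Run as built, this prints `2 + 3 = 5`. None of the parameter registers is preserved across the call, so anything still needed afterward has to be reloaded or kept somewhere else.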

Below is a short example program that makes four calls to libc functions: Two calls to puts(), a call to time, and a call to ctime. Here’s the makefile for the program:

showtime: showtime.o
        gcc showtime.o -o showtime -no-pie
showtime.o: showtime.asm 
        nasm -f elf64 -g -F dwarf showtime.asm -l showtime.lst

I’ve used this makefile for other example programs that call libc functions, and they all work. So take a look:

section .data
        timemsg db    "The timestamp is: ",0
        timebuf db    28,0   ; not used yet
        time1   dq    0      ; time_t stored here.

section .bss

section .text

extern  time
extern  ctime
extern  puts
global  main

main:
        push rbp            ; Prolog    
        mov rbp,rsp

        mov rdi,timemsg     ; Put address of message in rdi
        call puts           ; call libc function puts
               
        xor rax,rax         ; Zero rax
        call time           ; time returns time_t value in rax        
        mov [time1],rax     ; Save time_t value to var time1
        
        mov rdi,time1       ; Copy pointer to time_t value to rdi
        call ctime          ; Returns ptr to the date string in rax

        mov rdi,rax         ; Copy pointer to string into rdi
        call puts           ; Print ctime's output string
        
        mov rsp,rbp         ; Epilog
        pop rbp
        
        ret                 ; Return from main()

Not much to it. There are four sections, not counting the prolog and epilog: The program prints an intro message using puts, then fetches the current time in time_t format, then uses ctime to convert the time_t value to the canonical human-readable format, and finally displays the date string. All done.

So what’s the problem? When the program hits the second puts call, it hangs, and I have to hit ctrl-z to break out of it. That’s peculiar enough, given how many times I’ve successfully used puts, time, and ctime in short examples.

The program assembles and links without problems, using the makefile shown above the program itself. I’ve traced it in a debugger, and all the parameters passed into the functions and their return values are as they should be. Even in a debugger, when the code calls the second instance of puts, it hangs.

Ok. Now here’s the really weird part: If you comment out one of the two puts calls (it doesn’t matter which one) the program doesn’t hang. One of the lines of text isn’t displayed but the calls to time and ctime work normally.

I’ve googled the crap out of this problem and haven’t come up with anything useful. My guess is that there’s some stack shenanigans somewhere, but all the register values look fine in the debugger, and the pointer passed back in rax by ctime does indeed point to the canonical null-terminated text string. The prolog creates the stack frame, and the epilog destroys it. My code doesn’t push anything between the prolog and epilog. All it does is make four calls into libc. It can successfully make three calls into libc…but not four.

Do you have to clean up the stack somehow after a plain vanilla x64 call into libc? That seems unlikely. And if so, why doesn’t the problem happen when the other three calls take place?

Hello, wall. Anybody got any suggestions?

The Mastodon Hunters

Well, I didn’t expect this, though I probably should have: A huge wave of former Twitter bluechecks and their followers have descended upon the Mastodon Federation, and–sunuvugun–they’ve started throwing spears at each other.

First of all, for those who have never heard of it: Mastodon is a social network modeled superficially on Twitter. It’s distributed, in that anyone can create a server instance of Mastodon, and connect to other Mastodon instances through an underlying protocol called ActivityPub. It’s very cool in its own way, and brings other (ancient) distributed social networks to mind, like Fidonet and Usenet. Within a server instance, members can post and read tweet-ish things called “toots.” Theoretically, any Mastodon instance (there are about 7,000 of them) can trade traffic with any other Mastodon instance. Content moderation, codes of conduct, and control of which other instances can share traffic are entirely under the control of the members of a given instance. There is no centralized management. Each instance governs itself.

So NPR’s Adam Davidson set up a Mastodon instance called journa.host, mostly targeted at journalists fleeing Twitter. The journa.host instance now has about 1,600 members, though that number doubtless changes hourly. I’ve cruised some of the posts, and it looks a great deal like the sort of stuff we’ve always seen on Twitter: some interesting, some blather, and some complaining about the indiscretions of others. Here’s the weird part: Almost immediately, fights broke out.

Maybe that’s not weird. Maybe that’s just how social networks operate. In this case, it had repercussions: A great many Mastodon instances, told by one malcontent or another that journa.host was transphobic, decided to block journa.host entirely. If you read Twitter, look for posts by @ajaromano, a bluecheck journalist who’s been trying to figure out why journa.host is being blocked so much. There’s a threadroll here. She’s trying to pin down what makes journa.host transphobic, and so far she got nuthin. Someone linked to a transphobic NYT article? Seriously? The NYT?

What this leaves us with is basically a Twitter-flavored forum with 1,600 members, shunned by all the other major Mastodon instances. So much for having 75,000 followers.

Now, why? I seriously doubt journa.host did anything transphobic or Aja Romano would have found it by now. I think the problem is much simpler and more mundane: Longtime Mastodon users think the wave after wave of Twitter refugees are ruining the neighborhood. The federation network can’t crash, but massive activity spikes can slow things down enough so that it might as well have crashed.

I’m not sure why it should be so, but I’ve read that Mastodon leans left. So in a way it’s the perfect solution for people who hate Elon Musk enough to bail on Twitter, leaving their blue checks and their thousands of followers behind. Alas, right now it looks a lot like Mastodon’s fediverse is the Holy Roman Empire of social networks: thousands of dukedoms, city-states, and strange little scraps of intellectual backwaters and walled fiefdoms that just don’t talk to anybody else and occasionally start throwing rocks.

What happens next? Nobody’s saying it out loud, but I’ll hazard a guess: They’ll soon be back on Twitter. How soon? A month or so. We won’t know for sure because they won’t want to admit it, but Twitter is successful because it’s big. Musk will eventually figure out how to make it pay. The really interesting question is what shape the Mastodon fediverse will be in come the new year. What’s the sound of one instance banning?

Silence. Heh.

The Great 2022 Mastodon Migration

My God, you’d think the world was ending. The screaming, yowling, weeping and rolling on the floor in the wake of Twitter’s acquisition by Elon Musk is something to see. I’m interested in Twitter because for me it fills a need: quick announcements, wisecracks, indie book promo, Odd Lots-style links to things I find interesting or useful….so what’s not to like?

One thing, and one thing only: disagreement.

But that’s the viewpoint of the bluechecks, not me or most of my friends. The bluechecks are fleeing Twitter. Where to? Mastodon, mostly. Poor Mastodon. Gazillions of new users are arriving, with so little computer smarts that they can’t figure out how to use the platform. Mastodon has a lot of promise. This is their chance to make the bigtime, instead of lurking in the shadows of all the monumentally larger social networks. I’m very curious to see what they make of it. I wonder if they understand the demands that will be made of them: Forbid disagreement with…anybody I don’t like.

There was a time when disagreement was a learning opportunity. Or most of it, anyway, at least disagreement among reasonably intelligent people. But that was way back in the ’70s. As we slid into the ’80s, disagreement became insult. I avoided disagreeing with people of a certain psychology, knowing that they’d just get bright red and scream at me before I ever had a chance to make a case for my own positions.

The ’80s were the era when, little by little, I stopped going to SF conventions. Why hang out with people who’ll jump down your throat at the slightest hint of disagreement? I missed the social element of conventions, but by 1985 or 86 cons had gotten so toxic I just stopped going.

(These days I go to one con a year: Libertycon, where I know I won’t get screamed at for having ideas at odds with the bluecheck zeitgeist.)

Now, in the Groaning Twenties, disagreement is first-degree murder. Or genocide. Or maybe the heat-death of the universe. Does it bother me? No. It makes me giggle. I’ve been called a racist and a fascist and a few other more peculiar things. Like I said: I giggle. It’s all so silly. I still write subversive hard SF and program in Pascal. I am what I am. You can’t change me by screaming at me.

Why have I gone on at such length about the disagreement phenomenon? Easy: After years of being a staunchly defended echo chamber, Twitter is now trying to become a profit-making enterprise. I used to pay for CompuServe. If Twitter becomes a paid service, I will pay a (reasonable) price for a subscription. I get the impression (and admit I could be wrong; we’ll see) that Twitter will moderate people who use dirty words to denigrate other people…but won’t ban those posting links to peer-reviewed research showing that Ivermectin is an effective broad-spectrum antiviral.

That would be a tectonic change in the social media universe. It’s going to take a few years for Elon Musk to figure out how to do it. But that dude can orbit 52 telecomm satellites in one damfool rocket…I’m not willing to speculate on what he can’t do.

So. Has Twitter changed since the Great Mastodon Migration? A little. In scrolling down through my Twitter posts over the last month or so, I see a few replies have gone missing, doubtless originally posted by people who are now tooting their little hearts out over on Mastodon. With only a few exceptions, the bluechecks have very little to say that isn’t abject fury at people who disagree with them. (And to think I almost majored in journalism, sheesh.)

Musk is laying off thousands of people. The firm can either survive without them or fold. Me, I’m pretty sure the whole damned operation could be run by a thousand or so good, smart, devoted staffers. The trick is to find and motivate such staffers. I suspect Elon Musk can do it.

In the meantime, the bluechecks are fleeing. G’bye, guys! Have fun over on Mastodon! Here on Twitter we’re still having a wonderful time! (I’d say, “glad you’re not here,” but I’m too nice a guy to do that. What else could you expect from a Pascal programmer?)