robbat2: (Default)

Since I posted 15 hours ago, I've gotten another 6 spam comments from the same algorithm.

[livejournal.com profile] rachelutejy
[livejournal.com profile] wendyfidaz
[livejournal.com profile] victoriahuder
[livejournal.com profile] kellyqubym
[livejournal.com profile] devenydylom
[livejournal.com profile] hannahowyve
An interesting further trend I noticed: given the same input comment body, they post identical text, lending further credence to the idea that it's an automated system.
Original:
Seems like I'm out of luck? This will greatly hamper anything which I want to do with the data.
Spammer:
If I wanted to store my data on the user's drive or on a server, though, I'm pretty-much out of luck.
The spammer's source for the sentence was:
http://www.gamedev.net/features/reviews/productreview.asp?productid=613
robbat2: (Default)

I'm not sure what's with the sudden wave of spammers hitting livejournal. However, they are using some smarter behavior now.

They came to my attention because their spam engines take some comments or posts, use them as input data for Google, and then use the Google results for posting their comment. It's statistical noise, but it's very similar to the input noise, so it's beating Bayesian analysis as well.

Real:
There is, really, no difference between people stuck in poverty, unless it's a matter of degrees.
Spammer:
It's a matter of degrees, and while I think those degrees matter let's not pretend lawful companies are much better.
From a discussion about LDAP lookup failures:
Real:
There is actually a user and group for trousers. Also gentoo tends to like certain id/groups as certain UID/GID. If that group is used it will then user the next available GID/UID. Sometimes through later upgrades this becomes an issue. Usually badly written ebuilds. That is why I let the script make it.
Spammer:
Of course, not all, but many ebuilds are plainly badly written or gives bad implementations. One example, I update my OpenLDAP setup.

Yesterday:
[livejournal.com profile] gemmawasyl
[livejournal.com profile] dorottyaniqof

In a wave, over the last 3 hours:
[livejournal.com profile] amandafedij
[livejournal.com profile] bridgetuhimo
[livejournal.com profile] kaileymuseq
[livejournal.com profile] katyajuxek
[livejournal.com profile] annabellehusiq

Patterns:

They have a single post, with a large block of filler text from somewhere totally random, followed by a bunch of links with pornish titles that match the following pattern:
RANDOM = appears to be foreign dictionary words
CC = belgique, espana, france, quebec, suissse
TLD = .com, except for espana, which is under .es
WORDS = title words for porn stuff, no spaces
http://${RANDOM}.i${CC}.${TLD}/${WORDS}.html
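
The pattern above can be sketched as a regular expression. This is only a rough illustration, not a filter anybody actually deployed: the country-code list is copied verbatim from the list above, and the function name and the sample URLs in the usage note are made up for the example.

```python
import re

# Country-code labels and the TLD each one lives under, copied verbatim
# from the CC/TLD description above (.com everywhere except espana).
CC_TLDS = {
    "belgique": "com",
    "espana": "es",
    "france": "com",
    "quebec": "com",
    "suissse": "com",
}

# http://${RANDOM}.i${CC}.${TLD}/${WORDS}.html
SPAM_URL = re.compile(
    r"^http://(?P<random>[^./]+)"   # RANDOM: a single hostname label
    r"\.i(?P<cc>[a-z]+)"            # literal "i" followed by CC
    r"\.(?P<tld>[a-z]+)"            # TLD
    r"/(?P<words>[^/\s]+)\.html$"   # WORDS: no spaces, no further slashes
)


def looks_like_spam_link(url: str) -> bool:
    """Return True if url fits the observed pattern (hypothetical helper)."""
    match = SPAM_URL.match(url)
    if match is None:
        return False
    # The CC must be one of the observed ones, under its matching TLD.
    return CC_TLDS.get(match.group("cc")) == match.group("tld")
```

For instance, `looks_like_spam_link("http://zymurgie.ifrance.com/somewords.html")` is true for this invented URL, while `looks_like_spam_link("http://www.gentoo.org/index.html")` is false.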

robbat2: (Default)

Ok, so this isn't a full one week period yet, but I'm going to be out tonight probably, so 8 hours ahead of time is close enough. These also don't account for anybody who went and picked a specific mirror manually. I could do a much better job, but this is just a quick scrape of the numbers. There are many pitfalls in them, so they are more for interest than serious statistics.

Downloads by bouncer product (no arch breakdown)
gentoo-2008.0-livecd (x86,amd64)             72518
gentoo-2008.0-minimal                        26543
gentoo-2008.0-universal (hppa,ppc,sparc64)    2925
gentoo-2008.0-packagecd (sparc64)              385

'completed' from torrent tracker
livecd-i686-installer-2008.0-r1       975
livecd-i686-installer-2008.0          867
install-x86-minimal-2008.0            681
livecd-amd64-installer-2008.0-r1      451
livecd-amd64-installer-2008.0         373
install-amd64-minimal-2008.0          353
install-powerpc-universal-2008.0       69
install-powerpc-minimal-2008.0         61
install-alpha-minimal-2008.0           48
install-ia64-minimal-2008.0            46
install-hppa-universal-2008.0          42
install-sparc64-universal-2008.0       29
packages-sparc64-2008.0                28
install-sparc64-minimal-2008.0         28
install-hppa-minimal-2008.0            28

www node traffic

The two machines that exclusively serve the www.gentoo.org vhost normally do 6-9GiB/day in HTTP traffic; on the day of the release they jumped to 21GiB (and 14GiB on the second day).

robbat2: (Default)

So on Slashdot today, there was a link to the latest research into package manager security. Specifically, their focus was on defeating signed packages by use of malicious mirrors and replay attacks of signed content: recording the source of client requests, and possibly denying specific security updates (by serving an older tree that doesn't contain them).

This plays into some of my long-ongoing tree-signing research in Gentoo. The GLEPs with the exception of 02 and 03 have been mailed to the GLEP editors as well as the portage-dev mailing list, and will be going to the gentoo-dev mailing list after the GLEP editors have reviewed them.

For dealing with the new issues raised by Cappos et al, at Gentoo we are really lucky to have our own infra-maintained, hardened rotation of mirrors at rsync://rsync.gentoo.org/ in addition to the community mirrors at rsync://rsync$N.$CC.gentoo.org/. Nobody using just the infra-maintained mirrors (barring MITM attacks) would be vulnerable to the new attacks described by Cappos; however, those using a community-maintained mirror could be.

Using the main mirrors for the new signing purposes will enable us to deliver the new MetaManifests reliably via our own infrastructure, even when the user has a community mirror for their actual tree content. The actual changes to the GLEP for this weren't very big at all: just a timestamp header inside the signed area, as well as distributing the MetaManifests via a trusted medium.
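
To make the role of that timestamp concrete, here is a minimal sketch of the kind of freshness check a client could run once the signature has already been verified. It is not the GLEP's actual format or algorithm: the function name, the one-day `max_age` default, and the idea of remembering the last timestamp seen are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(
    signed_ts: datetime,
    last_seen_ts: Optional[datetime],
    now: datetime,
    max_age: timedelta = timedelta(days=1),  # assumed threshold, not from the GLEP
) -> bool:
    """Decide whether an already signature-verified timestamp is acceptable."""
    # Older than something we have already accepted: a replayed (rolled-back) tree.
    if last_seen_ts is not None and signed_ts < last_seen_ts:
        return False
    # Validly signed but stale: the mirror may be withholding newer updates.
    if now - signed_ts > max_age:
        return False
    return True


now = datetime(2008, 7, 15, 12, 0, tzinfo=timezone.utc)
print(is_fresh(now - timedelta(hours=2), None, now))  # True
print(is_fresh(now - timedelta(days=5), None, now))   # False
```

The point of putting the timestamp inside the signed area is that a malicious mirror cannot alter it without invalidating the signature, so an old but validly signed tree can still be recognised as old.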

As a minor side note on the infra-maintained rsync.gentoo.org rotation, this would be a good time to consider sponsoring a box to Gentoo for that purpose. Each of the 5 existing boxes in the rotation does 50-65GiB of traffic every day, which averages out to roughly 5-6.5Mbit/sec over a 24-hour period. These boxes are bandwidth, memory and CPU intensive; however, they don't hit disk very hard (we serve the trees directly from memory). The spec we need: 4GiB RAM, 2+ 64-bit processors (single core or dual core is fine), ~16GiB of disk (optional: software RAID1 is nice for avoiding downtime, and fancy fast disks aren't needed). We need a serial console or KVM to install it securely: you just boot the box to a livecd and send the access details to infra, and we install it from there with our own stage4 tarball that links into cfengine. The machine continues to be owned by the sponsor, in your data centre.
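
For anyone wanting to check the bandwidth figure, the conversion from daily volume to average rate is simple arithmetic. The sketch below assumes 1 GiB = 2^30 bytes and 1 Mbit = 10^6 bits; the function name is just for illustration.

```python
GIB = 2**30                      # bytes per GiB
SECONDS_PER_DAY = 24 * 60 * 60   # 86400


def average_mbit_per_sec(gib_per_day: float) -> float:
    """Average rate in Mbit/s (10^6 bits) for a given daily volume in GiB."""
    bits_per_day = gib_per_day * GIB * 8
    return bits_per_day / SECONDS_PER_DAY / 1_000_000


for gib in (50, 65):
    print(f"{gib} GiB/day is about {average_mbit_per_sec(gib):.2f} Mbit/s")
# 50 GiB/day is about 4.97 Mbit/s
# 65 GiB/day is about 6.46 Mbit/s
```

So the quoted 6.5Mbit/sec corresponds to the top of the 50-65GiB range; actual peaks during the day will be higher than the 24-hour average.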

robbat2: (Default)

So working on a cleanup of machines, I looked at the history of the infra pages in CVS, and I noticed that infra has had a lot of developers over the years.

I'm probably missing a few that never made it to the list, or predated the list, but I think it's a good start. I've also listed what they did, either from the webpage or from memory; again, apologies if I got it wrong.

I'd like to thank all of those that put work into infra in the past, but have retired from Gentoo:

  • Alex Howells (astinus) - Mirrors, DNS.
  • Jeffrey Forman (jforman) - Mirrors, DNS, Bugzilla, sysadmin, lots
  • Andrea Barisani (lcars) - Lists, LDAP, mail, sysadmin, lots
  • Kyle England (kengland) - sysadmin, cfengine
  • Lars Weiler (pylon) - CVS, SVN, overlays
  • Robert Coie (rac) - Forums, DBA
  • Jon Portnoy (avenj) - Mirrors
  • Sascha Schwabbauer (cybersystem) - Mail, Jabber
  • Tim Haynes (piglet) - Mirrors
  • Corey Shields (cshields) - LOTS
  • Rob Holland (tigger) - sysadmin
  • Benjamin Coles (sj7trunks) - Bugzilla, sysadmin
  • Michael Cummings (mcummings) - sysadmin
  • David Olsen (lude) - Mirrors
  • Albert Hopkins (marduk) - packages.g.o
  • Luca Mercuri (siggy) - www
  • Andrew D. Fant (jfmuggs) - backups, www
  • Curtis Napier (curtis119) - www, torrents
  • ???? (little_bob) - nagios

I'm not forgetting our current infra team; I hope to do a followup about them sometime soon too.

robbat2: (Default)

Up until recently, I had thought most Gentoo users and developers to be adults, who made sensible choices in their actions (but not always their words). This may be generalized to acting professionally. I am saddened to report on the ongoing degradation of the community in this regard, and how infra will deal with their side of it.

I've been with the infrastructure team in general for a very long time, however, up until April 2007, I was only the CVS administrator, and had no roles nor access outside of that. Since then, I stepped in as an extra sysadmin, and I've ended up as one of the operational leads, which still means I do most of the work, I just get to make the choices about it too. While the 'old' infra were in some cases called tyrants, dictators, cabals, and other nasty things, we as the 'new infra' hoped to change this view.

We're charged with a lot for developers and users: procuring machines to run services on, maintaining them, developing new services, troubleshooting some user and developer issues (eg: cvs/mirrors) and more.

For myself, in addition to the CVS/SVN/Git services that grew out of my CVS administration, I presently maintain LDAP, Lists and Bugzilla. I have also been the infra liaison to the releng team since 2007.0.

The various VCS and LDAP services are only of primary concern to developers, because extremely few users interact directly with them. However, Bugzilla and Lists are used by significantly more users than developers, and the interactions show.

All messages to mailing lists with 'unsubscribe' in the subject line get held for moderation and passed to me, and a great many of them are in the realm of blunt and abusive - usually from generic-sounding email accounts that have changed ownership to clueless people. There's also the fun of keeping the spam off (see my recent post to the mlmmj list, which I should possibly blog about). That's the mundane side. There's also moderation of the actual moderated announcement lists, and tracing mis-delivered list bouncemail as it gets reported. Lastly, and perhaps most important to some, we are held accountable to userrel and devrel for enacting list bans.

Bugzilla gets less direct abuse; however, when it happens, it's usually quite flagrant. jakub used to complain to me once or twice a month about users refusing to take no for an answer, and repeatedly filing duplicates, or deleting entire CC lists, or spamming a bug. Since his absence, I've caught fewer of these early on, simply because he basically read every bug that was filed, and I don't have the time for that (yes, I'd like him back, he did a good job). Bots that ignore robots.txt are a hassle, but are mostly manageable.

For developer issues, we haven't been offering executable homedirs for several years, since some former developers tried running BOINC, and various servers. It seems however that there has never been any codified warning, merely action on a case-by-case basis.

As of today, we're formalizing the handling of this. All infra-maintained machines either already have, or will shortly have, an AUP banner as follows:

 Any or all uses of this system and all files on this system may be
 intercepted, monitored, recorded, copied, audited, inspected, and
 disclosed to authorized site personnel, as well as authorized officials
 of federal law enforcement agencies, both domestic and foreign. By 
 using this system, the user consents to such interception, monitoring,
 recording, copying, auditing, inspection, and disclosure at the
 discretion of authorized site personnel. Use of this system constitutes
 consent to security monitoring and testing. All activity is logged with
 your host name and IP address. Unauthorized or improper use of this
 system may result in civil and criminal penalties. By continuing to use
 this system you indicate your awareness of and consent to these terms
 and conditions of use. -- Gentoo Linux Infrastructure Admins.

To make it more concise without the legalese: If you abuse a Gentoo infrastructure system, we have no compunctions about kicking your ass and handing you to the suitable authorities (userrel, devrel, $GOV_AUTHORITY).

What does this not mean? Aside from being proactive about patching security issues, we do not intend, nor do we have any plans, to target people that some of our group don't get along with - we're meant to be accountable and responsible to other authorities in Gentoo. We'll collect the evidence (logging) and execute you (retirement), but somebody else (devrel) gets to sentence you - the only exceptions to this are preemptive actions where we consider security to be at risk.

On the matter of logging, we aren't the Stasi either, we have far better things to do than babysit logs, and we've been logging a lot longer than I was ever even a Gentoo developer. Some former developers and infra folk automated the log analysis, so the only time we really need to look is when something has been brought to our direct attention and needs logs to back it up. The most common uses for the logs are finding abusive users and bots against rsync and bugzilla, plus doing audits after (in)security events.

robbat2: (Default)

Marissa is away on her SCA roadtrip from tomorrow, June 27th, until July 7th. So rather than just cooking for myself (which sucks) or going hungry every night, I'm telling my friendsbase in the Lower Mainland: consider me available to either join you for dinner (cooking or going out), or have you join me for my cooking.

Many folk will inform you that my culinary skills are excellent, including [livejournal.com profile] amethest, [livejournal.com profile] momiji12, [livejournal.com profile] galaxychild, plus lots more folk that aren't on LiveJournal.

This is also to keep myself from going insane while sitting at home all day working. By extension, inviting me to parties, or for hanging out, etc. is also cool.

robbat2: (Default)

Ok, so it seems that I'm blogging again a bit, but only about software bugs and treachery. Today's post is about how I've burnt about 6 hours of development time, working through what seemed to be a simple bug.

Some background first. I'd like to switch away from an older system that's presently only used as a display head, with dual 20" LCDs on an Nvidia card. It's too old and limited to run a lot of things on directly. The new system, which would be both a display head and the machine that already runs most of my apps, is a quad G5 with 12GiB of RAM. The great holdup has been in graphics. While I have access to both the stock GeForce 6600 NV43 that shipped with the machine, and an ATI X1900 G5 edition that I purchased later, my luck with graphics drivers has been less than stellar. My choices are between Nouveau, about which this tale ensues, and the competing free ATI drivers (which, at my last testing, were both still stuck, unable to read the AtomBIOS from the G5 card).

Some months ago, I filed a bug for Nouveau, that the first display output worked perfectly, but the second did not. It sat for about a month, before I got a response to go and try again. I didn't get to it until yesterday, because I was away in South Africa for two weeks, and busy with a myriad other things.

I updated to the latest Nouveau and x11-drm trees yesterday afternoon, and found that it no longer even starts up a single display. The debugging thus begins.

  1. The Device Control Block datastructure seems to have a bad data signature.
  2. Trace it. This is because the pointer to it is byteswapped.
  3. Hack in a byteswap. The signature is also byteswapped, something more central is wrong.
  4. The functions for "le{16,32}_to_cpu" seem to be broken. Just force them to byteswap for now.
  5. Update the nouveau bug again.
  6. Now we get an -ENOMEM from a memory allocation seemingly, but digging deeper, it's actually from ioctl.
  7. Update the nouveau bug again. Give up for the night.
  8. Update the x11-drm Git tree in the morning, and see that there's a fix for the ioctl stuff. Rebuild with it, and find that X will now start, but...
  9. The colors are badly swapped too! Red and Green are exchanged, and everything white is in a horrible shade of yellow.
  10. Try to initialize the second screen. "xrandr --display DVI-D-1 --right-of DVI-D-0". The taskbar expands as if it was covering both screens, but the second display is still not actually enabled.
  11. Update the nouveau bug again. Go and do other stuff.
  12. Start prodding into the nouveau driver source for the third time, looking into the color issues. Mention this in the IRC channel. pq suggests checking the actual X_BYTE_ORDER macro.
  13. Doing a quick C program gives the correct output, however during the Nouveau compile, it's defined to _X_BYTE_ORDER (with a leading underscore), and THAT isn't defined.
  14. Look at the FreeDesktop.org bugzilla again, locate a bug for the new issue.
  15. Leave a comment with some useful output on the bug, and then set out to trace it myself.
  16. Revert a Gentoo patch that removes _X_BYTE_ORDER (with the underscore) from xorg-server's configure.ac.
  17. Notice that _X_BYTE_ORDER is being defined to an EMPTY string now. That is NOT right.
  18. Start reading the rest of configure.ac. _X_BYTE_ORDER should have the value of "$ENDIAN".
  19. $ENDIAN is in turn defined by AC_C_BIGENDIAN([ENDIAN="X_BIG_ENDIAN"], [ENDIAN="X_LITTLE_ENDIAN"]).
  20. Absolutely great. So what the hell is the problem? Well, on powerpc, the macro returns "universal".
  21. Just what the hell is "universal"??? Well it seems that in autoconf-2.62, the autoconf maintainers made AC_C_BIGENDIAN take another input.
  22. AC_C_BIGENDIAN([ACTION-IF-TRUE], [ACTION-IF-FALSE], [ACTION-IF-UNKNOWN], [ACTION-IF-UNIVERSAL]).
  23. So where does that leave us? Up the creek with a broken autoconf I think. I haven't figured out why autoconf is telling us that we are universal yet, beyond that it's designed for the OSX universal binaries.
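
As an aside on steps 2 to 4, the symptom of a missing byteswap is easy to reproduce outside the driver. The sketch below is plain Python rather than driver code: `le32_to_cpu` here only borrows the name of the helper from step 4, and `read_without_swap` is a hypothetical stand-in for code that reads little-endian data in host order without converting it.

```python
import struct
import sys


def le32_to_cpu(raw: bytes) -> int:
    """Decode 4 little-endian bytes correctly, whatever the host byte order."""
    return struct.unpack("<I", raw)[0]


def read_without_swap(raw: bytes, host_order: str) -> int:
    """Hypothetical: read the same bytes in host order, with no conversion."""
    fmt = "<I" if host_order == "little" else ">I"
    return struct.unpack(fmt, raw)[0]


raw = struct.pack("<I", 0x12345678)        # value as stored little-endian
print(hex(le32_to_cpu(raw)))               # 0x12345678
print(hex(read_without_swap(raw, "big")))  # 0x78563412, what a big-endian host sees
print(sys.byteorder)                       # byte order of the machine running this
```

On a little-endian host the unconverted read happens to give the right answer, which is why this class of bug only shows up on big-endian machines like the G5, and only once the byte-order macro the build relies on is empty or wrong.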

Hopefully I'm not tromping off into the depths of Mordor (aka the hell of autoconf M4 as maintained by vapier and flameeyes).

robbat2: (Default)

I wanted to document how to deal with an annoying corner case with git-submodules here. This applies in two variations:

  • Changing the URL of a submodule to another tree with different ids (eg, two different git-svn conversions of the same SVN project).
  • Converting an entire existing directory to a submodule.

If you apply the changes to all of your branches right when you make them, they are easy to do, but if you have old rotting branches, and you haven't kept them up to date, you can run into fun errors like the following:

fatal: cannot read object c1a25b84627516da46b6c375f4dc874deedbb597 'vendor/plugins/rspec~a4c30e94d52232e958d4f53c6a633ed438c54bcc': It is a submodule!

"vendor/plugins/rspec" already existed as a directory in this case. So how do we work around this problem?

  1. Look at the "git log ${PARENT_BRANCH} .gitmodules". Every time there was a change to the file, make a temporary tag. I suggest using tags with date+time in the name, as you will be making more tags in a bit. You should do this on the branch that is the parent of the branch you want to fix, to cut down on conflicts.
  2. Next, identify spots where directories were deleted immediately prior to conversion to submodules, or between submodule URLs. Also tag these.
  3. Edit your .git/config, and comment out ALL submodule references.
  4. Now one at a time, issue "git pull . tags/${TAGNAME}", for each of your temporary tags, in order.
  5. If you hit conflicts, fix them up as usual (edit files, "git-update-index ${FILES}", "git commit -F .git/MERGE_MSG")
  6. If you think you messed up, use "git reset --hard ${COMMITISH} && git clean -f" with a known good point to reset you.
  7. Remember to clean up your temporary tags afterwards.
  8. Do "git submodule init" again.
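
The bookkeeping in the steps above (date+time tag names, pulling the tags in order, cleaning up afterwards) can be sketched as a small planner. This is only an illustration: it builds the command lines without running them, and the function names and the `tmp-submodule-fix` prefix are invented for the example.

```python
from datetime import datetime


def temp_tag_name(commit_time: datetime, prefix: str = "tmp-submodule-fix") -> str:
    """Tag name with date+time embedded, so sorting by name is chronological."""
    return f"{prefix}-{commit_time.strftime('%Y%m%d-%H%M%S')}"


def plan_pulls(tag_names):
    """One 'git pull . tags/<tag>' per temporary tag, oldest first (step 4)."""
    return [["git", "pull", ".", f"tags/{name}"] for name in sorted(tag_names)]


def plan_cleanup(tag_names):
    """Delete the temporary tags once the branch is fixed (step 7)."""
    return [["git", "tag", "-d", name] for name in sorted(tag_names)]


tags = [
    temp_tag_name(datetime(2008, 7, 1, 9, 30, 0)),
    temp_tag_name(datetime(2008, 6, 12, 18, 5, 42)),
]
for cmd in plan_pulls(tags) + plan_cleanup(tags):
    print(" ".join(cmd))
```

Each command list could be handed to `subprocess.run(cmd, check=True)` one at a time, stopping to resolve conflicts by hand as in step 5.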

I don't see why Git couldn't be made to do this automatically, since the process is reasonably simple, if a bit long to do by hand.

robbat2: (Default)
  • 10:15 Going to lunch w/ Jack #
Automatically shipped by LoudTwitter
robbat2: (Default)
  • 08:41 On ground @ SFO #
  • 10:06 On caltrain 236. Eta @ SJ-Diridon 11h00. #
robbat2: (Default)
  • 11:47 Traffic on the bus sucks #
robbat2: (Default)
  • 18:34 Mid june and 8C weather? Wtf! #
robbat2: (Default)
  • 23:28 Boarding flight CX888 to YVR #
robbat2: (Default)
  • 03:16 Now on plane back to HKG #

LoudTwitter

May. 5th, 2008 06:03 pm
robbat2: (Default)
  • 14:52 On my way to africa now. Expect tweets for 2 weeks. #
robbat2: (Default)
  • 23:54 One of the new SFU-ZU ads on skytrain has chani+pete. #
  • 23:58 Techbc/siat alumni event good. "rilli" looks v.promising. IsoHunt well received, ditto techbc.ca plans #
robbat2: (Default)
  • 11:40 Bah SSSS on me #
  • 11:53 SSSS faster than normal line #
  • 13:39 On plane now #
robbat2: (Default)
I need to write a really detailed blogpost later, however one of the odder moments, was a Gentoo user from the LA area calling me "a walking manpage". To dissect, this is not correct, a directory of manpages perhaps, but not a singular manpage.
robbat2: (Default)
  • 06:28 Monty's vodka, while good (hangover free), i suspect is to blame for utterly weird dreams. #
  • 11:10 No more conf now, maybe gentoo dinner tonight. #
