Musings on the Dogwatch

Monday, December 19, 2011

Git advantages in the corporate environment

Lately I've come across a couple of blog and message board posts (some of them admittedly dated) which insist git is not usable in a corporate environment because:

• git is not centralized.

Corporate environments MUST HAVE a centralized solution for backup purposes.

• git does not have canonical revision numbers.

Corporate environments must have canonical version numbers.

The Backup Issue

The first point is, of course, invalid. Git can be used in a centralized manner, making all the backup monkeys happy. Used in this manner, it's no different from subversion in that the repository exists on a server and can be backed up.
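Here's a rough sketch of what "centralized" git can look like, assuming a bare repository plays the role of the server. The names central.git, dev and backup.git are made up for the example, and a real backup job would copy the mirror to another machine.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# The "server": a bare repository that nobody edits directly.
git init -q --bare central.git

# A developer clones it, commits, and pushes back, much like a checkout and check-in.
git clone -q central.git dev 2>/dev/null   # hide the "empty repository" warning
cd dev
git config user.name "Dev"
git config user.email "dev@example.com"
echo hello > README
git add README
git commit -q -m "initial commit"
git push -q origin HEAD
cd ..

# The backup job: a mirror clone carries every branch and tag.
git clone -q --mirror central.git backup.git
git -C backup.git log --all --format=%s -1
```

The last command reads the history out of the backup copy alone, which is the whole point of the exercise.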

However, that ignores a far bigger point. What if your IT department isn't quite as awesome as they think they are? What if you were told about the crack sysadmin team, only to learn the system admin is a low-rate overseas hourly worker? What if he didn't quite get that backup job done before the bus came? What if he hasn't checked if the script he wrote 4 months ago is still working?

One night your server stops responding, and after opening a case with IT, you learn to your horror that they're unable to restore the backup.

Not to worry, right? Every developer on the team has a copy of your subversion repository, right? Wrong. They have a single checkout of the repository. One version. Need to revert to a previous version? Can't do it. Need to look at the changes involved in a particular bug fix? Not happening. Need to review the commit messages for a particular commit? Sorry, the entire log is gone.

This differs markedly from the git world. In the git world, should your centralized repository be lost, every single developer has a copy of the entire repository. It is possible that if a developer hasn't updated between the time another developer pushed some commits and the central repository was lost, he might be a couple commits behind. This is true in the subversion world too. If developers hadn't updated before the repository was lost, only the last person who checked in will have the last few commits.

It can't be stressed enough how much better your situation is with git. Think about it. With subversion, you'd figure out who had the up-to-date copy of the repository. Then what? You can't find out what revision your teammate had and send him a patch to bring his copy of the project up to date, since subversion can't generate a diff without access to the lost central repository. I suppose the most up-to-date developer could put his copy on a file share somewhere and developers could manually diff the files and apply any changes. Yeah, that's it, what fun.

The other choice of course is just to take the most up-to-date developer's project and use it to create a new subversion repository. That will work and the team will be able to check out an up-to-date copy of the project. An up-to-date copy with no history, no previous versions, nothing. Certainly not what one would consider "enterprise-level source code management."

Oh, but it also gets better. What if your team's discipline is really lacking, and nobody has a version checked out that actually works? You're hosed - you can't revert to a known working version - you have no central repository anymore.

Now let's look at the git situation. Like with subversion, the team is likely to figure out who has the most recent copy of the repository. With one command he can generate patch files that his colleagues can run to bring their repositories up-to-date before any kind of centralized server is restored. Depending on the size of the team, team members could be completely up-to-date, sharing changes again and productive again literally within a few minutes.

Oh and those patch files? Not only do they update the code to the correct state, but they also carry the log messages, authors and dates. And if you want the resulting repository to look EXACTLY like the repository from which it came, commit hashes included, colleagues can fetch directly from the up-to-date repository or pass around a git bundle instead of patches.
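A minimal sketch of that recovery, assuming two throwaway repositories named ahead (the most up-to-date developer) and behind (a colleague who is one commit short). The names and identities are made up for the example.

```shell
set -e
work=$(mktemp -d)
cd "$work"

# "ahead" holds the newest history.
git init -q ahead
cd ahead
git config user.name "Dev A"
git config user.email "a@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first commit"
cd ..

# "behind" cloned before the last commit was made.
git clone -q ahead behind
cd ahead
echo two >> file.txt
git commit -q -am "second commit"

# One command writes a patch file for every commit "behind" is missing.
base=$(git -C ../behind rev-parse HEAD)
git format-patch -q -o ../patches "$base"..HEAD

# The colleague applies them; the log message and author come along.
cd ../behind
git config user.name "Dev B"
git config user.email "b@example.com"
git am -q ../patches/*.patch
git log --format='%an: %s' -1
```

The final line prints the original author next to the original log message, even though Dev B is the one who applied the patch.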

The Revision Number Issue

I don't know who came up with this gem, but talk about a ridiculous argument. Someone tell me what a canonical version number is? Is it sequential? Is it unique? Is it alphanumeric? Is it octal?

Although subversion has simple version numbers, they're not particularly useful. Sure, I can tell my team member - "Hey, create that branch off revision 2142 of trunk," but that doesn't do him any good other than pointing to a version.

And while you would think you could figure out relationships between revisions using the revision number, it's actually difficult to do so, since revision numbers are global and increment when other developers check in, or even when someone checks in on another branch. So, think that revision 10 on trunk is the result of 5 commits on trunk since you checked in revision 5? Could be, but it also could be that someone checked in 4 revisions to a branch, so that revision 10 actually follows revision 5 in trunk. Not much value there.

Now, git has those super-complicated hex-looking numbers. I mean, really. We as developers are supposed to be able to deal with these hex strings? How ridiculous is that?

First of all, let's get the easy part out of the way. Those big hex numbers are not difficult to deal with because we can shorten them dramatically.

In small repositories, four characters (the minimum git accepts) is often sufficient. In large repositories you'll need a few more; git's own short form starts at seven characters and grows as needed to stay unique. Still no more difficult than typing version "21042." And of course we can use all sorts of symbolic names and expressions such as HEAD, HEAD^ (the parent of HEAD), time-based specifiers and others.
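As a quick illustration, here's a sketch in a throwaway repository (the file name and commit messages are made up) that asks git for the shortest usable prefix and then uses it, along with HEAD^.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Dev"
git config user.email "dev@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
echo two >> file.txt
git commit -q -am "second"

full=$(git rev-parse HEAD)
# Ask for a 4-character prefix; git lengthens it only if it would be ambiguous.
short=$(git rev-parse --short=4 HEAD)
echo "$full -> $short"

# The short form resolves back to the very same commit.
git rev-parse "$short"

# HEAD^ names the parent of HEAD.
git log --format=%s -1 HEAD^
```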

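More importantly, however, the commit ID is a hash of the commit itself: its contents, its log message and its parent commits, which means it covers the entire history leading up to that point. What this means is that you can GUARANTEE that code with the same hash is exactly the same.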

Here's how cool this is:

Let's say for a minute that the repository that went down was a public project that my company exposes to the outside world and allows others to clone. And now let's say that somehow, some way, I lost the last 5 commits out of that repository. I have a copy of the log, so I know what the commits were, but I already checked with everyone in my company and nobody has a copy of those commits.

Oh man, I'm in big trouble. I don't have a trustworthy source.

Can I really just reach out to some random 12-year-old on the Internet that has a copy of the repository and ask for a patch containing those commits? Maybe I should find someone that works for another large company. Surely that's a good way to make sure they're trustworthy. Or maybe I can get my company to pay for a background check on someone.

The fact of the matter is, if you have the hash, you need not worry. The code could come from the most notorious hacker group on the planet and as long as the hashes match, I know they did not change one single bit in my code.
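Here's a sketch of that check using three throwaway repositories: upstream (the public project before the loss), mine (my copy, missing the last commit) and stranger (the unknown person's clone). All three names are made up, and the expected hash stands in for the one I still have in my copy of the log.

```shell
set -e
work=$(mktemp -d)
cd "$work"

git init -q upstream
cd upstream
git config user.name "Dev"
git config user.email "dev@example.com"
echo one > file.txt
git add file.txt
git commit -q -m "first"
cd ..

# My surviving copy was taken before the commit that got lost.
git clone -q upstream mine

cd upstream
echo two >> file.txt
git commit -q -am "second"
expected=$(git rev-parse HEAD)   # the hash recorded in my copy of the log
cd ..

# Someone I have no reason to trust still has the full history.
git clone -q upstream stranger

cd mine
git fetch -q ../stranger         # download only, nothing is merged yet
got=$(git rev-parse FETCH_HEAD)
if [ "$got" = "$expected" ]; then
    git merge -q --ff-only FETCH_HEAD
    echo "hashes match, history restored"
else
    echo "hash mismatch, do not trust this copy"
fi
```

Fetching first and merging only after the comparison keeps the untrusted commits away from my branch until the hash checks out.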

Conclusion

Wow. That's a lot more than I intended to write. But hopefully you can see now why such arguments are not only ridiculous, but just plain wrong.

For the Mercurial fans out there, readers should note that I'm willing to bet Mercurial has the same advantages as git in these areas. Both are excellent DVCS systems and I wouldn't hesitate to use either.

Finally, I do recognize that there are situations for which a DVCS may not be appropriate. If you have terabytes of data in your repository, git or hg is probably not for you without a lot of repository re-organization. In those cases, by all means stick with your centralized VCS. Just realize that because you have something preventing you from using a DVCS doesn't mean the DVCS advantages don't still exist.

Monday, October 24, 2011

SQLite2 source and errors

Every once in a while, I come across a SQLITE2 database that needs to be converted to SQLITE3 format.

Doing so is easy. You just dump out of SQLITE2, and import into SQLITE3, such as:

sqlite2 mysqlite2db.db .dump > sqlite2_mysqlite2db.dump
sqlite3 mysqlite3db.db < sqlite2_mysqlite2db.dump

One problem, though, is that SQLITE2 is often no longer available through any of the standard Unix package management systems. So if you have a newer machine with an app that's utilizing a SQLITE2 database and you want to upgrade that app to a newer version that uses SQLITE3, it's tough to get SQLITE2 installed so you can do the conversion.

What I did is download the SQLITE2 source code from:

http://www.sqlite.org/sqlite-source-2_8_17.zip

Then I compiled. (Remove the tclsqlite.c file first. It's not needed.)

This is on Debian Linux, BTW.

gcc -o sqlite2 *.c

That worked and got me a sqlite2 executable. However, when I tried to use it on my SQLITE2 database, I got:

sqlite2: btree.c:702: sqliteBtreeOpen: Assertion `sizeof(ptr)==sizeof(char*)' failed.

Hmm...what the heck?

I didn't read the error carefully and instead went straight to googling, which didn't turn up much in the way of help. Then it struck me that this was a 64-bit problem. Aha! I had forgotten to compile the program with the 32-bit flag. No problem.

(On OSX the flag would be -arch i386)

gcc -m32 -o sqlite2 *.c

But that gave me:

/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory

After a little googling on that, I discovered I needed to install the libc6-dev-i386 package, since I was running on an AMD 64-bit system. I installed the package, recompiled and everything is working fine.

Wednesday, September 7, 2011

Installing Ruby 1.8.7p352 under rvm

I was recently trying to install Ruby under RVM on OS X 10.6.8, running in 32-bit mode.

For Ruby 1.9.2 p290, it was no problem. rvm installed it without issue (using x86_64).

However, Ruby 1.8.7 p352 would fail constantly, with:

making ruby
/usr/bin/gcc-4.2 -arch i386 -arch x86_64 -g -Os -pipe -no-cpp-precomp  -fno-common -pipe -fno-common    -DRUBY_EXPORT  -L. -arch i386 -arch x86_64 -bind_at_load   main.o  -lruby -ldl -lobjc   -o ruby
ld: warning: ignoring file ./libruby.dylib, file was built for unsupported file format which is not the architecture being linked (i386)
Undefined symbols for architecture i386:
  "_ruby_init_stack", referenced from:
      _main in main.o
  "_ruby_init", referenced from:
      _main in main.o
  "_ruby_options", referenced from:
      _main in main.o
  "_ruby_run", referenced from:
      _main in main.o
ld: symbol(s) not found for architecture i386
collect2: ld returned 1 exit status
lipo: can't open input file: /var/folders/rp/rprVpkCiGbKvS3sirOXX9k+++TI/-Tmp-//ccoGzQBh.out (No such file or directory)
make[1]: *** [ruby] Error 1
make: *** [all] Error 2

You can see above that files are getting correctly compiled with both the -arch i386 and -arch x86_64 flags. That was confusing. I verified that the Makefile included the arch flags in both the CFLAGS and LDFLAGS variables.

I traced the problem to the following line of output:

cc -dynamiclib -undefined suppress -flat_namespace -install_name
 /Users/wwilliam/.rvm/rubies/ruby-1.8.7-p352/lib/libruby.dylib 
-current_version 1.8.7 -compatibility_version 1.8   array.o bignum.o 
class.o compar.o dir.o dln.o enum.o enumerator.o error.o eval.o file.o 
gc.o hash.o inits.o io.o marshal.o math.o numeric.o object.o pack.o 
parse.o process.o prec.o random.o range.o re.o regex.o ruby.o signal.o 
sprintf.o st.o string.o struct.o time.o util.o variable.o version.o  
dmyext.o  -o libruby.1.8.7.dylib

As you can see, the dynamic library is being linked here without any architecture flags. What needed to be set is the LDSHARED variable.

My first attempt was to modify the Makefile by hand, adding the -arch flags to LDSHARED. That worked - somewhat. I was able to run make and get everything compiled. However, that didn't solve the problem completely, because whenever I tried to install via rvm again, my changes were overwritten since rvm ran configure again.

Naturally, for my next attempt, I did an export LDSHARED="-arch i386 -arch x86_64" in the terminal in which I was running rvm install. That also did not work. I checked and config.log DID show my additions to LDSHARED. Hmmm...WTH?

It turns out that the configure.in defines LDSHARED explicitly for each machine type and does not use the value that configure picks up.

For darwin, it was:

        darwin*)        : ${LDSHARED='cc -dynamic -bundle -undefined suppress -flat_namespace'}

So the fix, in this case, was to change it to:

        darwin*)        : ${LDSHARED='cc -arch i386 -arch x86_64 -dynamic -bundle -undefined suppress -flat_namespace'}
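For anyone unfamiliar with the syntax on that darwin line, here's a small sketch of what the ": ${VAR='default'}" idiom does in plain shell. The flag strings are shortened for readability, and this only shows the idiom in isolation; it says nothing about what rvm or the rest of Ruby's configure did with the variable in my build.

```shell
# Case 1: the variable is unset, so the default gets assigned.
unset LDSHARED
: ${LDSHARED='cc -dynamic -bundle'}
echo "unset  -> $LDSHARED"
first=$LDSHARED

# Case 2: the variable already has a value, so the default is skipped.
LDSHARED='cc -arch i386 -arch x86_64 -dynamic -bundle'
: ${LDSHARED='cc -dynamic -bundle'}
echo "preset -> $LDSHARED"
```

The leading colon is the shell's do-nothing command; it exists here only so the expansion, and therefore the assignment, gets evaluated.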

I did submit bug #5295 against Ruby 1.8 for this issue. Whether it will get fixed or not, I don't know.

Saturday, June 4, 2011

Update on VATSIM work

I realize I haven't posted on this blog in many months, but I thought I'd do a short update just to talk about the things I've been working on.

Most of my work recently has gone into the VATSIM FSD. None of the changes have been earth-shattering, but they have involved things like updating the FSD to work with sqlite3, changing its build system to CMake and improving its method of locating its configuration files. None of those changes will be noticed by VATSIM users, but they make life easier for those of us who work on the code and support the servers.

For those of you waiting with bated breath on the upper-layer wind improvements I've been working on, the bad news is it's still not finished. The good news is I've started working on it again. I'm currently in the process of changing the backend database from MySQL to Postgres. Once that's complete, I'll be able to go back to finishing up the wind management code.

Thursday, July 22, 2010

Converting Bazaar Repositories to Git

I have enjoyed using Bazaar. Its flexibility is tremendous. I've always used whatever SCM system I was using from the command-line and avoided turning on integration in the IDE since it seemed to cause more problems for very little benefit.

However, after watching some of the Xcode 4.0 WWDC videos, I've realized the writing is on the wall. Xcode integration of svn and git will now provide useful benefits. I also realize that as the third most popular DVCS, Bazaar is unlikely to ever see support in Xcode.

So, with that in mind, I've decided to switch from Bazaar to Git (unless something goes horribly wrong).

The positive news is that so far, the conversion has been about as painless as it could possibly be. Certainly far more painless than the conversion from CVS to Bazaar.

To convert from Bazaar to Git involved the following steps:

1) Install the Bazaar fastimport module

2) Install git

3) Create a git repository into which to convert

git init git.repo

(use git init --bare git.repo instead if it's going to be on a central server)

cd git.repo

4) Run a command to export the bazaar repository, piping the result to a command to import into git

In this example:

~/bzr_repo - the bazaar repository to convert

~/git.repo - the new git repository

(command is being run from the git.repo directory)

bzr fast-export --export-marks=../marks.bzr ../bzr_repo/ | git fast-import --export-marks=../marks.git

That's it. You're now done. You can now clone that repository and go to work.
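If you're curious what actually travels down that pipe, here's a sketch that feeds git fast-import a tiny hand-written stream instead of the output of bzr fast-export. The committer, timestamp and file are made up, and a real export is of course much longer. I'm using a bare repository and naming refs/heads/master explicitly so the example doesn't depend on any checked-out branch.

```shell
set -e
work=$(mktemp -d)
cd "$work"
git init -q --bare git.repo
cd git.repo

# One commit, one file. "data N" is followed by exactly N bytes,
# counting the trailing newline.
git fast-import --quiet <<'STREAM'
commit refs/heads/master
committer Example Dev <dev@example.com> 1279756800 +0000
data 15
initial import
M 100644 inline hello.txt
data 6
hello
STREAM

# The imported history is now ordinary git history.
git log --format=%s refs/heads/master
git show refs/heads/master:hello.txt
```

One thing to keep in mind: fast-import writes commits and branches but does not touch the working tree, so in a non-bare repository you'll want to check out the imported branch before expecting to see any files.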

Tuesday, July 6, 2010

Doctors' payments

Does anyone else hate trying to pay their doctor as much as I do?

First, you can never figure out how much you owe them. Typically, a huge bill comes first. But you know you don't owe that much, since insurance is going to cover the majority.

So you wait for your next statement to see what you really owe. But now you're past-due. And woe be to you if you visit the doctor more than once in 30 days; you'll be lost in an impossible maze of charges and insurance reimbursements.

Ok, fine. My doctor sent me a "please pay now" notice, so I'm pretty sure this is what I actually owe. No problem. While I'm sitting here paying bills online, I'll just quickly go to their website and pay online.

Except that they don't accept online payment. Are you freaking kidding me? An industry that has massive trouble collecting payments does not provide a way for patients to pay online? Yep, that's right. Ninety-nine percent of doctors have no way for you to pay online.

I could pay with my regular online bill payment service. However, it seems like I have about 12 different accounts associated with the doctors we see. So it's awfully inconvenient to have 12 different payees in my payment service.

Fine. I'll call them. Oh wait - it's 11 pm and an industry which has trouble collecting payments only has its billing office open from 9-5 on weekdays.

Fine. I wait until tomorrow. I call the business office during their advertised hours and listen on hold for 20 minutes while the voice response system tells me my call is important. After 20 minutes, it routes me to a voice mail box. I leave a voice mail. Four hours later, after receiving no call back, I call again. 20 minutes later I leave another voice mail. Four hours later after no call back, I call again. Twenty minutes of repeated "your call is important to us" later, the system tries to transfer me to an operator, but it hangs up instead.

The business office is now closed for the day. To make a long story a little shorter, after two more attempts the next day, I finally get through and pay my bill over the phone with a debit card.

In short, absolutely unacceptable.

Yes, I realize I could just write a check, get a stamp, put it in the envelope and mail it. However, that's not the point. The point is, I want to pay my bill NOW. Online. And you're not letting me do it.

No wonder you have problems getting paid.

Friday, June 25, 2010

Status of the Mac client

As some of you may have guessed by my lack of posts, the Mac client has basically been stalled. My real-world job has been absolutely crazy, summer means all sorts of kids' activities and I've picked up additional responsibilities at VATSIM.

I have no intentions of abandoning the project though. In fact, I'm considering adding some additional team members to help speed things up. Things I'd likely farm out would be the things that can be easily compartmentalized like alias and POF file handling, flight strips, etc.

I haven't made a final decision yet, but if you know Objective-C and want to help, let me know.