As an active contributor to technical forums, I am often faced with surprisingly simple questions about basic programming environments from computer science students. Students are often lazy, so why should it surprise me that so many don't bother to learn the basics, right? Maybe, but too often the questions come from students who are otherwise bright and not at all lazy. So what is the true problem?
Thinking back to my own basic programming classes, I think I can see the problem clearly. I cannot think of one professor who explained the programming environment in class. The books tend to skip over those parts as well.
What is the number one problem students have with their Java projects?
Classpath.
Something as simple and basic as that is just not explained in class. Is it so simple that it ought to be self-explanatory? Perhaps, but I wouldn't expect a first-year CS major to grasp the concept of classpaths without at least a basic introduction, so why do CS professors expect it? Sit a student in front of a tool like Eclipse and I guarantee their eyes will glaze over, and a simple concept like a compiler will not occur to them on their own. Command line? Forget it.
With development environments getting more complex each year and the command line becoming a thing of the past (unless you're a Unix geek like me), I think CS programs should look into offering classes on the basics of development environments: source control, IDEs, and so on. It would help students understand programming better and give them a better start in the workplace.
Because the emacs dumper makes all sorts of assumptions it really shouldn't, compiling emacs on Fedora produces a core dump at the dumping step. The error looks something like this:
Dumping under names emacs and emacs-21.3.1
make[1]: *** [emacs] Segmentation fault (core dumped)
There's a very easy way to fix this:
setarch i386
and emacs will build just fine.
Running "svn update" is really simple and speedy, but I like having a nightly cronjob updating my local copy of the code, so I wrote a script that does just that and sends me a small, quick summary of what it did, paying special attention to conflicts.
I really don't care which files have changed, were added, and so on, but I do care which files have merge conflicts, so that's what this script reports. The output looks something like this:
Tue Sep 21 15:40:50 2004 Updated /home/ktrapszo/test
1 Conflict Detected!
Conflicted files:
/home/ktrapszo/test/test.java
2 files added
13 files updated
0 files deleted
0 files merged
Finished update in 13 seconds
That's really all I care to see. If I want specific updates on specific files, I can always check the daily changelog email.
So for those who work like me and like an automated update that only sends a small summary of important info, the script is here.
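For readers who want a feel for the idea without the full script, here is a minimal sketch of the counting step. It is not the script linked above. The function name summarize_update and the wording of its output are made up for illustration, and it assumes that each status line from "svn update" starts with a status letter (A, D, U, C or G) in the first column, followed by a space or a property-status letter, then a space.

```shell
#!/bin/sh
# Sketch only: summarize `svn update` output read on stdin.
# summarize_update is a hypothetical helper, not the script from the post.
# Assumption: status lines look like "U    path" or "C    path"; anything
# else (for example "Updated to revision 7.") is ignored.
summarize_update() {
    awk '
        /^[ADUCG][ UCG] / {
            code = substr($0, 1, 1)          # content status letter
            count[code]++
            if (code == "C") {
                path = $0
                sub(/^.[^ ]* +/, "", path)   # drop the status columns
                conflicts = conflicts path "\n"
            }
        }
        END {
            if (count["C"] > 0) {
                printf "%d conflict(s) detected!\n", count["C"]
                print "Conflicted files:"
                printf "%s", conflicts
            }
            printf "%d files added\n",   count["A"]
            printf "%d files updated\n", count["U"]
            printf "%d files deleted\n", count["D"]
            printf "%d files merged\n",  count["G"]
        }
    '
}
```

From a cronjob it could be fed with something like: svn update /home/ktrapszo/test 2>&1 | summarize_update, with the result piped on to your mailer of choice.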
HP Unix (or, as I like to call it, h-pox, and a pox it is) is not a popular platform, and thanks to my recent exposure to it I completely understand why. The non-ANSI C compiler it comes with is a winner, as is the fact that its getpass() library function only accepts 8 characters (what is this, the 80s?), as I just learned. The hard way.
So if you want to compile the Subversion client on this OS, it will only work with short passwords, unless before configuring you let APR know to use its own getpass() instead of the system's.
The way to do that? Set this before you run configure on the source (adjust the syntax for your favorite shell, of course; I'm a tcsh gal):
setenv ac_cv_func_getpass no
This tip brought to you by Joe Orton of the Subversion users list. Thanks!
Here are some Subversion benchmarks I've done. mod_ntlm is an Apache module and can be found here; AuthenNTLM is a Perl Apache module that does the same thing and can be found here.
Benchmarks were done using 1892 plain text source files. Total size of the data: 25MB; size in svn: 66MB. Each benchmark was a commit to a freshly created repository followed by a checkout of the same files to a fresh work area.
The connection was a partial T1 shared by about 20 people during regular business hours.
The strangest result seems to be mod_ntlm + ssl being faster than mod_ntlm alone. I ran that benchmark twice, at two different times of the day, and the result was the same. Odd.
| method | commit (minutes) | checkout (minutes) |
|---|---|---|
| AuthenNTLM | 12 | 5 |
| AuthenNTLM + mod_deflate | 12 | 5 |
| AuthenNTLM + ssl | 13 | 6 |
| AuthenNTLM + ssl + mod_deflate | 13 | 6 |
| mod_ntlm | 28 | 18 |
| mod_ntlm + mod_deflate | 28 | 15 |
| mod_ntlm + ssl | 25 | 16 |
| mod_ntlm + ssl + mod_deflate | 29 | 14 |
| svnserve | 6 | 4 |
| svnserve + ssh | 6 | 5 |
Note: These would obviously be affected by (a) traffic on the network and (b) how busy the svn server was at the time of the benchmark, but more or less it should show which access methods are faster than others and what is the price for more secure access.
So far there is one item on my list of Subversion issues, and I discovered it just today. When you ask Subversion for a diff between the current version of a file and the previous revision (svn diff $file), it looks at the timestamp of the current file and of the version in the repository. If the timestamps are identical, the diff always reports no changes. This is a problem.
I discovered this when running test conversions from our ClearCase repository. It turns out that sometimes my script is just too fast, and a version of a file may have the same timestamp as the previous revision, which means Subversion merrily ignores the change and doesn't commit it. Ironically, this would be less of a problem on Windows, since Windows stores milliseconds rather than just seconds as Unix does, but that's a whole other issue. Of course, this is pretty easy to fix in my script: just touch the file with its original revision date and Subversion is happy to acknowledge the changes. Still, I see this as a weak point in what I thought was a pretty well thought out SCM tool.
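The workaround amounts to very little code. A minimal sketch, assuming the original revision date is already available in the [[CC]YY]MMDDhhmm[.SS] form that "touch -t" accepts; stamp_revision_date is a hypothetical helper name, not something taken from my conversion script:

```shell
#!/bin/sh
# Sketch only: give a file the modification time of the revision it came
# from, so two versions written within the same second still end up with
# different timestamps.
# stamp_revision_date is a hypothetical name; the second argument uses the
# [[CC]YY]MMDDhhmm[.SS] format accepted by `touch -t`.
stamp_revision_date() {
    file=$1
    stamp=$2
    touch -t "$stamp" "$file"
}
```

For example, stamp_revision_date test.java 200409211540.50 would be called right after writing a version's contents and before committing it.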
The reasoning behind it is, of course, performance: it's a lot faster to determine whether a file has changed if you do not have to actually compare two files, and size alone isn't the most reliable indicator. Well, it turns out the timestamp isn't necessarily reliable either. Now put the two together, size and timestamp in a hash, and you have a much higher chance of being correct when making the "is the file changed" assumption. That is what I suggested on the Subversion users list and, like everything else in life, it just isn't quite so simple: it appears the current implementation provides no simple access to the size information of the in-repository revision.
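To make the size-plus-timestamp idea concrete, here is a rough sketch of the check in shell. It is only an illustration of the heuristic and not how Subversion is implemented; the two paths stand in for the working file and the copy it is compared against, and the function names (same_mtime, same_size, looks_unchanged) are my own.

```shell
#!/bin/sh
# Sketch only: a cheap "has this file changed?" heuristic that looks at
# both modification time and size instead of the timestamp alone.
# All function names here are hypothetical.

# True when neither file is newer than the other.
same_mtime() {
    [ -z "$(find "$1" -prune -newer "$2")" ] &&
    [ -z "$(find "$2" -prune -newer "$1")" ]
}

# True when both files have the same size in bytes.
same_size() {
    size_a=$(wc -c < "$1")
    size_b=$(wc -c < "$2")
    [ "$((size_a))" -eq "$((size_b))" ]
}

# Treat the file as unchanged only if both cheap checks agree.
looks_unchanged() {
    same_mtime "$1" "$2" && same_size "$1" "$2"
}
```

An edit that keeps both the size and the timestamp the same still slips through, so this only improves the odds; a full content comparison is the only certain answer.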
Given that my original suggestion isn't easily accomplished, I hope my second suggestion, allowing the user to override this default behavior, will be implemented (of course, performance will suffer, but reliability will improve for scripted commits).
After about a month of pretty heavy usage and testing of Subversion, this is so far my biggest issue, and it is one not likely to be encountered in day-to-day usage. Not bad at all; my CVS list is a lot longer. Of course, I haven't really started on hooks and heavy scripting yet.
- Add it to services
In /etc/services:
svnserve 3690/tcp # Subversion svnserve
svnserve 3690/udp # Subversion svnserve
- Create the xinetd configuration file for svnserve
In /etc/xinetd.d/svnserve:
# default: on
# Subversion server
service svnserve
{
    socket_type = stream
    protocol = tcp
    user = svnadmin
    wait = no
    disable = no
    server = /usr/local/bin/svnserve
    server_args = -i
    port = 3690
}
Of course, adjust the above for port, file location, user, etc.
- Restart xinetd service
kill -SIGUSR2 `cat /var/run/xinetd.pid`
- Verify svnserve is listening
# netstat -anp | grep LISTEN | grep 3690
tcp 0 0 0.0.0.0:3690 0.0.0.0:* LISTEN 3303/xinetd
- or -
# telnet localhost 3690
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
( success ( 1 2 ( ANONYMOUS ) ( edit-pipeline ) ) )