Braggtown dot com

A Tangled Web: Archive

Archive for the ‘Work’ Category

 Passion Quilt Meme - information empowers

Thursday, May 15th, 2008

Brandi tagged me with the Passion Quilt Meme- “Take/make a photo and caption it with a statement that you feel passionate for children, students, libraries…”

Here’s mine:

information empowers

I’m a digital librarian and love technology, but in the end, it’s about people and freedom.  It’s about empowering people and leveling the field.  That’s one of the reasons I wanted to become a librarian.  I spend a lot of time at a computer, often working in isolation from users as my project isn’t a user-focused endeavor.  Still, it’s the people that are important.

This photo was taken at the Katherine Dunham Center in East St. Louis.  My wife, Brandi, is in the back row.  Her class, LIS 451 Introduction to Networked Information Systems, built computers and a lab for, and provided instruction to, neighborhood kids in one of the most impoverished areas of the country.  Although my job here doesn’t look much like my job there, it’s still about empowering people.

 Data Transfer

Tuesday, April 22nd, 2008

My next objective at work is to transfer some data to the Library of Congress. I have the option of pushing data up the Abilene Network and will probably give network transfer a try, but the bulk of the data will move on two 1TB USB drives via FedEx. I’ll load them up and ship them to LC, where they’ll unload them and ship them back. Back and forth until they get everything.

I have an 11-page document specifying the organization, naming conventions, etc. The most important point is that I need to create a UTF-8 text file in the top-level directory containing the path and checksum (MD5 or SHA-1) for each file.
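A minimal sketch of how such a manifest could be generated, not the actual tool used for the transfer. The function names, the manifest.txt filename, and the "checksum, two spaces, relative path" line layout (borrowed from md5sum) are all assumptions for illustration; the LC document defines the real layout.

```python
import hashlib
from pathlib import Path


def file_checksum(path, algorithm="md5", chunk_size=1024 * 1024):
    """Return the hex digest of one file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)  # "md5" or "sha1"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(root, manifest_name="manifest.txt", algorithm="md5"):
    """Write a UTF-8 manifest at the top of `root` listing a checksum and path per file.

    The filename and line layout are hypothetical, not taken from the LC spec.
    """
    root = Path(root)
    manifest_path = root / manifest_name
    # Skip the manifest itself so a rerun does not list its own previous output.
    files = sorted(p for p in root.rglob("*") if p.is_file() and p != manifest_path)
    with open(manifest_path, "w", encoding="utf-8", newline="\n") as out:
        for path in files:
            relative = path.relative_to(root).as_posix()
            out.write(f"{file_checksum(path, algorithm)}  {relative}\n")
    return manifest_path
```

Calling write_manifest("/storage/ndiipp1", algorithm="sha1") would walk that slice and write the manifest at its top level. Reading in chunks matters here because some of the files may be far larger than available memory.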

So, I have 5 1TB ZFS slices sitting in a storage array in our server room. Here’s the df -h:

Filesystem        Size  Used  Avail  Use%  Mounted on
/storage/ndiipp1  977G  975G  1.9G   100%  /storage/ndiipp1
/storage/ndiipp2  977G  972G  5.2G   100%  /storage/ndiipp2
/storage/ndiipp3  977G  804G  174G   83%   /storage/ndiipp3
/storage/ndiipp4  977G  826G  152G   85%   /storage/ndiipp4
/storage/ndiipp5  977G  820G  158G   84%   /storage/ndiipp5

There are a couple of issues I need to think through. Hopefully I have enough free space after formatting (haven’t decided on a file system to use on USB drives) to perform a 1:1 copy from partition to USB drive. Of course I still need room for my UTF-8 manifest file. It seems like I’ll have space.

We’ve got 4GFC Fibre Channel switches in our storage area network, but I’ve only got a 100BASE-T LAN connection to my workstation. I’m very curious to find out how long it will take both to transfer the data from the SAN to the USB drive (probably using tar over SSH) and to checksum the up to 85,000 files in each partition. I’m sure I’ll be glad I kept my old Xeon workstation to chew data. I think I’ll look around for a utility to grab some network statistics like collisions and resent packets.
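For a rough lower bound on the transfer time, here is a back-of-the-envelope sketch. It assumes the G in the df -h output means GiB (1024^3 bytes), that the 100 Mbit/s LAN link is the bottleneck, and it ignores SSH and tar overhead, disk speed, and the USB bus. The function name and the efficiency parameter are made up for illustration.

```python
def transfer_hours(size_gib, link_mbps, efficiency=1.0):
    """Estimate hours needed to move `size_gib` GiB over a `link_mbps` Mbit/s link.

    `efficiency` is a hypothetical fudge factor (0-1) for protocol overhead.
    """
    size_bits = size_gib * 1024**3 * 8
    usable_bits_per_second = link_mbps * 1_000_000 * efficiency
    return size_bits / usable_bits_per_second / 3600


# Used space per slice, taken from the df -h listing above.
used_gib = {
    "ndiipp1": 975,
    "ndiipp2": 972,
    "ndiipp3": 804,
    "ndiipp4": 826,
    "ndiipp5": 820,
}

for name, size in used_gib.items():
    print(f"{name}: about {transfer_hours(size, 100):.1f} hours at full line rate")
```

Even at full line rate, the fullest slice works out to roughly a day per copy, so the real figure will be longer once overhead and checksumming are included.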

Luckily, this may be network, processor, and time intensive, but it’s pretty automation friendly. That’ll give me some time to figure out why Fedora doesn’t seem to want to deploy properly in Sun Java App Server. Then I can start mapping data models between the DSpace and Fedora repositories.

 Samba on Hardy redux

Wednesday, April 16th, 2008

Hardy smbfs is borked.  Actually, I understand that smbmount has been abandoned.  CIFS, the samba replacement in Hardy, is busted.  All hail Ubuntuforums.org!  Beta OS != perfect, right?

 Samba in Ubuntu 8.04 Hardy

Tuesday, April 15th, 2008

It seems there was a change in the Samba package between Ubuntu 7.10 and 8.04.  I was getting an error while trying to connect to some Solaris shares.
$ smbclient -L //web -I 192.1.168.0 -U user%password
Server requested plaintext password but 'client use plaintext auth' is disabled

It seems that adding the following to your /etc/samba/smb.conf file solves the problem:

client plaintext auth = yes
client lanman auth = yes

It took a while to realize that just setting plaintext auth to yes wasn’t enough. lanman auth overrides it. Should have read the man page more closely, I guess.

 Portland POV

Tuesday, February 26th, 2008

I got to Portland for Code4Lib early today and had a chance to explore the city with some of the usual suspects (UCSD and UNT Denton).  I learned a couple of things.  First, Rogue chocolate stout is awesome.  Second, I should replace the mess of PHP and JavaScript at work with Django.  As a bonus, I learned that packing a few terabytes of data into the trunk of a state car and driving it to DC isn’t necessarily a bad idea.

 Today

Sunday, February 17th, 2008

today's picture: it's a secret

Actually, ‘today’ was in late October. I just noticed this draft and thought I’d push it out. It didn’t work out, obviously.

 Systems Tasks

Sunday, February 10th, 2008

A couple of weeks ago I decided I needed to retool some of the NCGDAP data processing tools I wrote when I started at NC State in 2005. For a while I’d been using Subversion, but fell out of the habit. I was pretty confused to find that I had at least 4 versions of everything I’d written and no idea which was latest, which features I’d already incorporated, or (embarrassingly) what everything did. I’d clearly been shirking some system administration duties.

After some time spent with diff and a text editor, I was down to one version of each application. I also spent some time trying to make sure I didn’t have to do it all again. I sometimes work on my Thinkpad at home, at conferences, and on the bus, which eliminates keeping things solely on a network somewhere. Knowing that I sometimes forget which files have been modified, I wrote a bi-directional rsync over ssh process to sync my Thinkpad with my desktop and can run it from an icon on my Gnome panel.
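A rough sketch of what a bi-directional rsync over ssh pass can look like, not the script linked below. It assumes two plain rsync runs, one in each direction, with -u (--update) so that a newer file on the receiving side is never overwritten. The function names and paths are hypothetical.

```python
import subprocess


def build_sync_commands(local_dir, remote_dir):
    """Return two rsync command lines: local to remote, then remote to local.

    -a preserves permissions and times, -u skips files that are newer on the
    receiver, -z compresses in transit, -e ssh selects ssh as the transport.
    """
    base = ["rsync", "-a", "-u", "-z", "-e", "ssh"]
    # Trailing slashes make rsync copy directory contents, not the directory itself.
    local = local_dir.rstrip("/") + "/"
    remote = remote_dir.rstrip("/") + "/"
    return [base + [local, remote], base + [remote, local]]


def sync(local_dir, remote_dir, dry_run=True):
    """Run both directions; with dry_run=True rsync only reports what it would do."""
    for command in build_sync_commands(local_dir, remote_dir):
        if dry_run:
            command = command[:1] + ["--dry-run"] + command[1:]
        subprocess.run(command, check=True)
```

Something like sync("/home/me/work", "desktop:work", dry_run=False) would then do a full pass. One limitation of this approach is that deletions do not propagate: a file removed on one side comes back from the other on the next run.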

I also wrote a nightly cron job to back up my work desktop to a network drive. The NCGDAP applications reside on the data processing server, so I Samba mount that directory at boot. I finally installed and configured network-manager-vnc. It was ridiculously easy compared to vpnc, which never worked correctly. At home, I configured sshfs mounting of my work desktop from my home desktop so I never have to make a local copy of anything to work on it.

Last but not least, I finally got around to installing Cygwin on Brandi’s Windows XP laptop. Now she can click an icon in her start menu that starts an rsync over ssh backup job to my desktop. She had been copying her My Documents directory and pasting it into her home directory on my machine using Samba, which took eons. I also added it to Windows Task Scheduler, which is utter crap compared to cron. After the first 8-hour run (~80 GB of music), it takes seconds and I don’t have to wonder about compliance.

Here are links to some of the things I wrote:

Cygwin rsync script, backup batch file to run Cygwin script, laptop sync bash script, sshfs mount script, sshfs umount script

For the record, I’m neither a programmer nor a system administrator. I’m just a librarian.

 LC Multi-State Project

Tuesday, January 8th, 2008

LC made an announcement about the Multi-State Initiative. I saw this on the diglib list. NCSU Libraries is a partner on the North Carolina geospatial project.

DIGITAL PRESERVATION PROGRAM ADDS NEW PARTNERS TO PRESERVE STATE
GOVERNMENT DIGITAL INFORMATION

Digital Preservation Network Grows to More Than 100 with New Partners

Twenty-one states, working in four multistate demonstration
projects, are today joining the Library of Congress’s National Digital
Information Infrastructure and Preservation Program (NDIIPP) in an
initiative to catalyze collaborative efforts to preserve important state
government information in digital form.
States face formidable challenges in caring for digital records
with long-term legal and historical value. A series of Library-sponsored
workshops held in 2005 and involving all states revealed that the large
majority of states lack the resources to ensure that the information
they produce in digital form only, such as legislative records, court
case files and executive agency records, is preserved for long-term
access. The workshops made clear that much state government digital
information, including content useful to Congress and other
policymakers, is at risk of loss if it is not now saved.
“The records of state government are of keen interest to
Congress as well as to the states themselves, and it is critical that we
work with state archives and libraries in their efforts to ensure that
this information remains available and accessible,” said Librarian of
Congress James H. Billington. “I am committed to having the Library
play a leadership role in encouraging the preservation of these
important resources.”

These partnerships expand the NDIIPP network to include state
government agencies. In August, the network added partners from the
private sector in an initiative called Preserving Creative America. With
these new partners, the NDIIPP network now comprises well over 100
members, including government agencies, educational institutions,
research laboratories and commercial entities.
“The Library of Congress is eager to welcome state partners in
our growing digital preservation network,” said Associate Librarian
for Strategic Initiatives Laura E. Campbell, who is leading NDIIPP for
the Library of Congress. “These projects will help ensure long-term
access to critical information for both Congress and the American
people.”
The projects will collect several significant categories of
digital information such as geospatial data, legislative records, court
case files, Web-based publications and executive agency records. Each
project will also work to share tools, services and best practices to
help every state make progress in managing its digital heritage.
The states projects are the most recent initiative of NDIIPP
(www.digitalpreservation.gov), authorized by Congress in December 2000.
A cornerstone of NDIIPP has been the establishment of a broad network of
partners committed to the continuing selection, collection and
preservation of significant digital content that is at risk of loss. The
total amount of the funds being made available to the new partners is
$2.25 million.
Following are the lead entities and the focus areas of the
projects:

Arizona State Library, Archives and Public Records: Persistent Digital Archives and Library System.

Arizona will lead this project to establish a low-cost, highly automated information
network that reaches across multiple states. Results will include
techniques for ingesting mass quantities of state data as well as
developing a strong data management infrastructure. Content will include
digital publications, agency records and court records. States working
in this project are Arizona, Florida, New York and Wisconsin.

Minnesota Historical Society: A Model Technological and Social Architecture for the Preservation of State Government Digital Information.

The project will work with legislatures in several
states to explore enhanced access to legislative digital records. This
will involve implementing a trustworthy information management system
and testing the capacity of different states to adopt the system for
their own use. Content will include bills, committee reports, floor
proceedings and other legislative materials. States working in this
project are Minnesota, California, Kansas, Tennessee, Mississippi,
Illinois and Vermont.

North Carolina Center for Geographic Information and Analysis: Multistate Geospatial Content Transfer and Archival Demonstration.

Work will focus on replicating large volumes of
geospatial data among several states to promote preservation and access.
The project will work closely with federal, state and local governments
to implement a geographically dispersed content exchange network.
Content will include state and local geospatial data. States working in
this project are North Carolina, Utah and Kentucky.

Washington State Archives: Multistate Preservation Consortium.

The Washington State Archives will use its advanced
digital archives framework to implement a centralized regional
repository for state and local digital information. Outcomes will
include establishment of a cost-effective interstate technological
archiving system, as well as efforts to capture and make available
larger amounts of at-risk digital information. Content will include
vital records, land ownership and use documentation, court records and
Web-based state and local government reports. States working in this
project are Washington, Colorado, Oregon, Alaska, Idaho, Montana,
California and Louisiana.

 DCC Wrapping Up

Thursday, December 13th, 2007

The Digital Curation Conference is wrapping up. There were some salient take-aways.

  1. The UK has a much more focused national strategy for digital library development due largely to the structure of funding organizations.
  2. Relatedly, the one-off, siloed, unfederated, independent architecture of US digital collections, both within and across individual institutions, is due largely to the grant situation in the US.
  3. Holding a digital library/data center/information management conference 2 stories below ground without wireless Internet or mobile phone reception is both hugely annoying and enormously beneficial.
  4. The Library of Congress National Digital Information Infrastructure Preservation Program, while providing me with interesting work, doesn’t seem like it will provide the national direction that I think is necessary. I trust the digital preservation experience will be valuable to others, though.
  5. Conferences that include food and Internet in the conference registration rock. It’s nearly impossible for most academics I know to request funding for Internet.

 Up for Lunch Later?

Friday, December 7th, 2007

This is an IM conversation I had today with a friend.  Names have been changed to protect the sociopathic.

(10:56:16 AM) Friend: you up for lunch later?
(10:56:31 AM) me: Uh, well, why the hell not, right?
(10:57:01 AM) Friend: exactly. you would not be happy if I die this weekend and you gave up the last chance to eat lunch with me.
(10:57:34 AM) me: No, but my guilt would be assuaged by the thought of going through your stuff and reallocating that cool keyboard to my cube.
(10:57:37 AM) me: know what I mean?
(10:58:40 AM) Friend: oh yea, I’m a sociopath. I don’t feel any emotional attachment to other people. I live in a world of cardboard cutouts. I wouldn’t feel bad very long at all.

Maybe I should submit this to bash.org.  Or, maybe I should work on developing relationships with people who are sane.