Tarh̯untaš

DLI Books to DJVU

I am also one of those who read books from DLI (the Digital Library of India), and I don’t particularly like fetching the pages one by one in TIF format. So I have been tinkering with this script for about a year, and I think it is fairly decent by now. It expects the link to a book as it appears in the search results, and it takes 'ocr' as an optional second argument to run ocrodjvu on the finished file.

Perhaps it will be useful to someone else. If you try it, please tell me how it fared.

#!/bin/bash
 
# test if anything was given to the script
if [[ $1 = "" ]]; then
    echo "You must give the URL to be processed."
    echo "Enter 'ocr' as second argument to use ocrodjvu."
    exit 1
fi
 
url=$1
 
# converts % codes
url=`echo "$url" | sed "s/%20/ /g"`
url=`echo "$url" | sed "s/%27/'/g"`
url=`echo "$url" | sed "s/%28/(/g"`
url=`echo "$url" | sed "s/%29/)/g"`
 
# gets the variables
 title=`echo "$url" | sed -r "s/.*title1=([^&]*)&.*/\1/g"`
author=`echo "$url" | sed -r "s/.*author1=([^&]*)&.*/\1/g"`
 pages=`echo "$url" | sed -r "s/.*pages=([^&]*)&.*/\1/g"`
  path=`echo "$url" | sed -r "s/.*url=([^&]*)/\1/g"`
 
# shows them
echo ""
echo -e "Author:\t$author"
echo -e "Title:\t$title"
echo -e "Pages:\t$pages"
echo -e "Path:\t$path"
 
# assembles the filename
filename="$author - $title"
 
# tests if the directory named $filename already exists,
# if not, it's created, then changes to its path
if [ -d "$filename" ]; then
    echo -n "The directory '$filename' already exists. "
else
    mkdir "$filename"
fi
 
cd "$filename"
 
# creates directories to hold the .tif and .djvu files
if [ ! -d tif  ]; then
    mkdir tif
fi
 
if [ ! -d djvu ]; then
    mkdir djvu
fi
 
cd tif
 
# if there is a 'last' file, makes the script continue
# from that; otherwise, starts from 1
if [ -f last ]; then
    firstpage=`cat last`
    echo "Resuming from page $firstpage..."
    firstpage=`echo "$firstpage+1" | bc`
else
    firstpage=1
fi
 
echo ""
tput sc
 
# iterates the download for each file
# the exact path is a hack that happens to work...
# it avoids downloading files again by checking the
# timestamp of each file, that is the "-N" option
for i in $(seq $firstpage $pages); do
 
    echo -n Page $(printf "%08d" $i)...
 
    # paths under data1, data2 or data3 are served by new1,
    # everything else by new
    if [[ $path == *data[123]* ]]; then
        host=www.new1.dli.ernet.in
    else
        host=www.new.dli.ernet.in
    fi

    wget -N -q --random-wait "http://$host/$path/PTIFF/$(printf "%08d" $i).tif"
 
    if [ $? = 0 ]; then
        echo -n " done."
        echo "$(printf "%08d" $i)" > last
 
        # converts to djvu
        cjb2 $(printf "%08d" $i).tif ../djvu/$(printf "%08d" $i).djvu > /dev/null 2>&1
 
        tput el1
        tput rc
 
    else
        echo "error!"
    fi
 
done
 
cd ..
 
# assembles the djvu pages in one bundle
djvm -c ../"$filename".djvu djvu/*djvu
 
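# ocr: the bundle was written to the parent directory,
# so that is where ocrodjvu has to look for it
 
if [ "$2" = "ocr" ]; then
    ocrodjvu -o ../"$filename (ocr)".djvu ../"$filename".djvu
fi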
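
The script decodes only four percent codes, and it decodes them before it splits the URL into parameters. As a rough sketch of a more general approach (not what the script does), the two helpers below split the query string first and then decode every %XX escape. The names urldecode and get_param are made up for this example, and the URL in the usage note is a placeholder, not a real DLI link.

```shell
#!/bin/bash

# Decode every %XX escape in a string by turning it into \xXX
# and letting printf's %b interpret it.
urldecode() {
    local s=$1
    printf '%b' "${s//%/\\x}"
}

# Print the decoded value of one query parameter of a URL.
# Returns 1 if the parameter is not present.
get_param() {
    local query=${1#*\?} name=$2 pair pairs
    # split the query string on '&' without globbing
    IFS='&' read -ra pairs <<< "$query"
    for pair in "${pairs[@]}"; do
        if [[ $pair == "$name="* ]]; then
            urldecode "${pair#*=}"
            echo
            return 0
        fi
    done
    return 1
}
```

With a placeholder URL such as 'http://example.org/book?author1=DOE%20JOHN&pages=250', get_param "$url" author1 prints DOE JOHN. Splitting before decoding means a %26 inside a title cannot be mistaken for a parameter separator.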

A Collatz Conjecture’s Bonsai

I’ve recently made this, after seeing xkcd’s cartoon about the Collatz Conjecture. Maybe it’s just me, but I rather like it. It was made by iterating from 1 to 10000 and calculating the possible paths of each number. If a number leads only to its double, it is blue; if it also leads to an odd number, red; the grey ones were not iterated, but were calculated implicitly by doubling the previous ones. Et voilà!
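
The colouring rule can be sketched in a few lines of bash. This is only my reading of the description, not the code that drew the picture: it assumes the graph runs backwards through the Collatz map, so that every n leads to 2n, and also to (n - 1) / 3 when that is an odd whole number, which happens exactly when n leaves remainder 4 on division by 6. The function names and the default limit argument are made up for the example.

```shell
#!/bin/bash

# Colour of node n: grey if it lies beyond the iterated range,
# red if it also leads to an odd number, blue otherwise.
collatz_colour() {
    local n=$1 limit=${2:-10000}
    if (( n > limit )); then
        echo grey
    elif (( n % 6 == 4 )); then
        echo red
    else
        echo blue
    fi
}

# Edges leaving n: always 2n, plus (n - 1) / 3 when that is odd.
collatz_edges() {
    local n=$1
    echo "$n -> $(( 2 * n ))"
    if (( n % 6 == 4 )); then
        echo "$n -> $(( (n - 1) / 3 ))"
    fi
}
```

For instance, 10 leads to 20 and to 3, so it would be red, while 5 leads only to 10 and would be blue.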

BASH Miscellanea

How to read a $file line by $line:

while IFS= read -r line; do echo "$line"; done < "$file"
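
One caveat with loops of this kind: when the loop is fed through a pipe, bash runs it in a subshell, so any variable set inside it is gone once the loop ends. A minimal sketch that feeds the loop by redirection instead, so that a counter survives; count_nonempty is a name made up for the example.

```shell
#!/bin/bash

# Count the non-empty lines of a file.
# IFS= and -r keep leading spaces and backslashes intact;
# the extra test catches a last line without a trailing newline.
count_nonempty() {
    local file=$1 line count=0
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ -n $line ]]; then
            count=$(( count + 1 ))
        fi
    done < "$file"
    echo "$count"
}
```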

How to extract URLs from a file (from here): Grab this sed script, make it executable (chmod +x list_urls.sed), then:

cat * | ../list_urls.sed
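
If the sed script is not at hand, a rough substitute can be built on grep. This is not the linked script, only a sketch with a deliberately simple pattern: it catches http and https links and stops at whitespace, quotes and angle brackets. The function name list_urls is made up for the example.

```shell
#!/bin/bash

# Print every http(s) URL found in the given files, one per line.
# With no arguments it reads standard input.
list_urls() {
    grep -Eho 'https?://[^[:space:]"<>]+' "$@"
}
```

It can be used as list_urls * or at the end of a pipe, like cat * | list_urls.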

How to count the number of files inside a $directory (from here):

ls -1 "$directory" | wc -l
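
Counting the lines of ls output skips hidden files and miscounts any name that contains a newline. A sketch of an alternative that lets the shell do the counting; note that, unlike the line above, it does include hidden files. count_entries is a name made up for the example.

```shell
#!/bin/bash

# Count the entries of a directory, hidden ones included.
# The body runs in a subshell, so the shopt settings do not leak out.
count_entries() (
    shopt -s nullglob dotglob
    set -- "$1"/*
    echo "$#"
)
```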

BWV 912

Apparently I have been listening to Bach’s toccatas since I was nine. So far I have had time to like them without really knowing what they were, to grow sick of them and forget them, to remember that they exist, to grow sick of them again, to remember them during a fit of listening to the whole BWV, to grow sick of them yet again, and to come back to liking them. During the first fits I would listen to the whole CD waiting for the fugue of the toccata in C minor, BWV 911, which was my favourite. Following the trend of my mood over the years, which is, I hope ; ), to be ever less decrepit and serious, I now prefer the next one, in D. Its fugue is what I remember when my mood is exceptionally good, which seems to be the case here:

Lingua Avium

Pacuvius, in Cicero’s De Divinatiōne:

istis qui linguam avium intelligunt
Plusque ex alieno iecore sapiunt quam ex suo,
Magis audiendum quam auscultandum censeo.

Those who understand the language of birds,
and who know more from another’s liver than from their own,
are rather to be heard than listened to, I reckon.

Montaigne quoted this somewhere, and Charles Cotton, whose wig alone could have led me to guess his surname, translated it into English, in sixteen-eighty-something, like this:

Who understand what language birds expresse,
By their owne than beasts-livers knowing lesse,
They may be heard, not hearkned to, I guesse.

Chorus

From the preface of Aulus Gellius’ Noctes Atticae, Aristophanes’ Frogs, 354-9:

All evil thoughts and profane be still:
far hence, far hence from our choirs depart,
Who knows not well what the Mystics tell,
or is not holy and pure of heart;
Who ne’er has the noble revelry learned,
or danced the dance of the Muses high;

(…)

I charge them once, I charge them twice,
I charge them thrice, that they draw not nigh
To the sacred dance of the Mystic choir.
But ye, my comrades, awake the song,
The night-long revels of joy and mirth
which ever of right to our feast belong.

Prefaces

I tend to hate prefaces, like one of a full hundred pages in a book of just two hundred. Maybe I am just not polite enough to let the author say whatever he wants before saying what I want, but still… I just can’t help it. Especially the damned baroque prefaces in italics: I have never read even one worth the time. An example? This is a very short one, with a typeface legible enough.

Now, I’ve just stumbled upon this one, by a certain Morleigh (?), Life in the west: back-wood leaves and prairie flowers:

I consider a Preface to be a work of supererogation, and never read one with common patience in my life. Nevertheless, I think it right to inform the reader, that three of the following papers have already met the public eye, in the pages of a leading London periodical.

Agreed. But he could at least give his name somewhere; it tends to be important.

Etymologia Nova I

whiskey dicitur quia saepe viscerās dolēre facit.

How to activate wake-on-lan

First you need to see if the network card supports it (you need the ethtool package). I had to flash newer firmware on a fairly ancient computer to get this to work:

#ethtool eth0 | grep "Wake"

If it spits something like

Supports Wake-on: g
Wake-on: d

then we are good to go : ). To activate it:

#ethtool -s eth0 wol g
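
The check and the activation can be combined, so that the second only runs when the card reports support for the magic packet mode (the letter g). A minimal sketch, assuming ethtool prints the "Supports Wake-on:" line in the form shown above; supports_magic_packet is a name made up for the example.

```shell
#!/bin/bash

# Succeeds if the ethtool output on standard input lists 'g'
# (magic packet) among the supported wake-on modes.
supports_magic_packet() {
    grep -Eq '^[[:space:]]*Supports Wake-on:[[:space:]]*[a-z]*g'
}

# Usage, as root:
#   ethtool eth0 | supports_magic_packet && ethtool -s eth0 wol g
```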

To find out the MAC address:

#ifconfig

Let’s test it now. Halt the computer, then (you will need the wakeonlan package):

#wakeonlan [the machine's MAC address]
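
For the curious, what gets sent is a so-called magic packet: six bytes of ff followed by the target’s MAC address repeated sixteen times. The sketch below only builds that payload as a hex string, to show its layout; it sends nothing, and the function name magic_packet_hex is made up for the example.

```shell
#!/bin/bash

# Print the magic packet for a MAC address as a hex string:
# ffffffffffff followed by the MAC repeated 16 times.
magic_packet_hex() {
    local mac packet=ffffffffffff i
    # drop the separators and normalise to lower case
    mac=$(printf '%s' "$1" | tr -d ':-' | tr 'A-F' 'a-f')
    if [[ ! $mac =~ ^[0-9a-f]{12}$ ]]; then
        echo "invalid MAC address: $1" >&2
        return 1
    fi
    for (( i = 0; i < 16; i++ )); do
        packet+=$mac
    done
    echo "$packet"
}
```

The result is 102 bytes long, that is, 204 hex digits. The wakeonlan tool takes care of building and broadcasting it for you.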

Et voilà! … but you find out, later, that it only works if you run the ethtool command at each boot. Instead of editing /etc/init.d/networking like you did before ; ), you find that it is easier to edit /etc/network/interfaces and add the last two lines below:

iface eth0 inet dhcp
post-up /usr/sbin/ethtool -s $IFACE wol g
post-down /usr/sbin/ethtool -s $IFACE wol g

Only then does it really work.

How to change the LCD brightness

Acer Palmatum

I have an Acer Aspire 4530, and I wanted to change the backlight brightness remotely (now, that is really useful ; ) ) with Debian Squeeze and SSH. So:

#echo 9 > /sys/class/backlight/acpi_video0/brightness

The maximum value is given by this file:

cat /sys/class/backlight/acpi_video0/max_brightness

Much better than using the keys, ha.
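
Since the maximum differs from machine to machine, it can be handier to think in percentages. A minimal sketch, assuming the same two sysfs files as above; the function name set_brightness_percent and its optional second argument (the backlight directory) are made up for the example, and writing to the real files needs root.

```shell
#!/bin/bash

# Set the backlight to a percentage of its maximum.
# The second argument is the sysfs directory; it defaults to the one used above.
set_brightness_percent() {
    local percent=$1 dir=${2:-/sys/class/backlight/acpi_video0} max
    # 10# forces base ten, so that "08" is not read as octal
    if [[ ! $percent =~ ^[0-9]+$ ]] || (( 10#$percent > 100 )); then
        echo "percentage must be an integer from 0 to 100" >&2
        return 1
    fi
    max=$(cat "$dir/max_brightness") || return 1
    echo $(( max * 10#$percent / 100 )) > "$dir/brightness"
}
```

With a maximum of 9, set_brightness_percent 50 writes 4, as the division rounds down.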