Friday, August 9, 2019

Data Compression

Idea of compression? 

Plenty of disk space .. doen't effect most of us.
Wrong impression... compression is used behind the scences in computer systems quite often.

Even your voice is compressed when you speak on the phone: telephone compnay can chieve a vstly superior utiliation fo their resources if they compress voice data before transporting it.

Lossless Compression: The ultimate free lunch

two very different types of compression: lossless and lossy. lossless compression algorithm cant produce dramatic savings on every file. But a good algorithm will produce substantial savings on certain common types of files.

Humans do this all the time without even thinking about it Eg. weekly calenders - each hours for 5 days - 40 hrs, . but you may says somethings like "monday and tuesday are full and I am booked from 1pm to 3pm on Thursday and Friday, but otherwise available"

basic idea is to find parts of the data that are identical to each other and use some kind of trick to describe those parts more efficiently.

AAAAAAAAAAAAAAAAAAAAABCBCBCBCBCBCBCBCBCBCAAAAAADEFDEFDEF

I may write like 21A, 10BC, 6A, 3DEF. In this case 56 characters is compressed to only 16. less than one-third. this is call Run Length Encoding. It encodes a "run" of repetitions with the "length" of that run.

Run Length Encoding

RLE is only useful for compression of specific types of data. It is used mostly in combination with other algorithms like Huffman coding. Problem with RLE is repeatitions in the data has to be adjacent. on other data in the repeated parts.. eg. ABXABYAB

Fax machines can take advantage of RLE as  faxes are black and white documents  which get converted to larger number of dots with each dots being either black or white.

Same-as-Earlier Trick

DJKSEMSIT-SM-DJKSEMSIT-ALXDF-O-DJKSEMSIT-ALXDF-DJKSEMSIT-EW-ALXDF

"-" are just put in so that we can read.. in actual sequece there is no dash.

56 characters..

think of go back to and copy.. b for back, c for copy.. b10c6 means go back 10 and copy 6

DJKSEMSIT-SM-b11c9-AlXDF-O-b16c15-b14c9-EW-b16c5

40 characters

There is one more interesting twist. FG-FG-FG-FG-FG-FG-FG-FG. 8-repretitions. FG-FG-FG-FG-b8c8 or FG-b2c14.

Shorter-Symbol Trick

Computers stored everything as a number and interpreted as a letter according to some fixed table. "a" is represented by 27, b is 28, c is 29. Then abc is stored as 272829. A is 01 and Z is 26, "a" is 27 and "z" is 52", a stroke is 80 and U stroke is 99.  "Meet your fiance there." will be 1331314600514147440032352740298200463431443166

so this message is 46 digits long.

This trick is something Humans do all the time without even thinking about it. USA - United State of America - we use 3 letter codes for a the full 24 letter phrase it stands for. How about "The sky is blue in color"? The key difference is that one of these phrases is used much more often than the other. we can save a lot more by abbreviating a frequently used phrase instead of one that is rarely used.

e and t are commonly used, let cut them down to single digit.

M eet  y o u r   f i a n c e   th ere.
1388900514147440032352740298200934844866

There is a  fatal flaw, as computer doesn't store spaces between individual letters. Let see first 5 digit 13889, it can be interpreted as 13-8-8-9 or 13-88-9 or 13-8-89.

There is no way to tell one digit code or two digit code. We have to make some sacrifice: some of our codes will be actually get longer. The ambiguoug two-digit codes that start with 8 or 9 will become three-digit codes that do not start with 8 or 9. Anything start with a 7 is a three digit code, any starting with an 8 or 9 is a single digit code. anything starting with 0,1,2,3,4,5 or 6 is the same two-digit code.

M eet  y o u r   f i a n c e    th ere. 13889005141474400323527402978200934844866

46 digits to only 41.

ZIP
Step 1 same as earlier trick, back and copy
Step 2 most common symbols and construct table
Step 3 transform again by translating into numeric codes defined in step 2

the table is stored together with the file, becomes a ZIP file. In a real ZIP file the original file is broken up into chunks and each chunk can have a different numeric code table.

Lossy Compression

For image or audio, as long as it look the same and sound the same, it doesn't really matter whether the file storing is exactly the same as the file stores that song on a compact disc. Sometimes lossy compression is used in much more extreme way such as compression is being used aggressively to ake the fiale size of the video or images very small resulting in low-quality video and images.

Leave-it-out Trick

Simple and useful trick. first understand how black and white picture are stored, grey scale and pixels and resolution.For 320 x 240 picture, number of pixels rather large, 320*240 = 76,800. Extremely simple technique: ignore, or leave out every second row or second column of pixels. It results in picture of 160 x 120. The size will be only one-quarter of the original. Reduce again and the size will be only 6% of original size.

But when the picture is decompress back to original size, computer has to guess the colors of missing pixels. Most of the visual feature may be retained but there is some definite loss of quality and details, especially in complex areas.

But computer do indeed leave out information to achieve lossy compression, but they are much more careful. JPEG strategy for  is complicated. But basic flavor is fairly straightforward. JPEG divides the whole image into small squares of 8x8 pixels. Each is compressed separately.  If the square happens to be all one color, the entire square is represented by a single number. If the square is mostly the same color, with a few very slight differences, computer can decide to represent the square by a single number, with only a small error. JPEG works well with picture. MP3 and AAC generally use the same high-level approach as JPEG.

Origin of Compression Algorithms

The same-as-earlier trick is the main compression methods used in ZIP file and it is known as LZ77 algorithm. It was invented by two Israeli computer scinetists, Abraham Lempel and Jacob Ziv, and published in 1977.

Claude Shannon the bell labs scientist who founded the fiedl of information theory withhis 1958 paper. Not only error correcting codes, but also figure importantly in the emergence of compression algorithms.

error-correcting codes and compression algorithms are two sides of the same coin. Error-correcting codes can be viewed as a principled way of adding redundancy to a message or file.
Compression algorithms do the oppostie: they remove redundancy from a messge of file.

Shannon, in his seminal 1948 paper, included a description of one of the earlier compression techniques, Shannon-Fano encoding. It is a particular way of implemneting the shorter-symbol trick. The Robert Fano's student solved the problem for achieving optimal compression for each individual symbol. The student was David Huffman, and his technique-now known as Huffman coding is another example of shorter symbol trick.


Saturday, December 31, 2011

7 segment LED by scanning
First the scanning is done with turn on time around 15 cycles @ 8MHz and off time around 2 cycles. When the scanning is done so fast different digit of 7segment display appeared messed up, even with the correct voltage (waveform) at the pin. This may be due to the delay caused by the current limiting resistor to each segment or may be due to the turn-off delay of the transistors.

The scanning timing is changed to 10 msec/ digit and the display are quite clear and bright.
Voltage Reference using TL431
TL 431 requires 1 mA current. When the current to the TL431 is less than 1ma due to the current drawn by load resistors, the reference voltage will become less than 2.5V.

Sunday, June 12, 2011

Accessing Blocked Websites

With the ease of configuration of Opera Unite Webproxy and ease of Wampserver, i will make use of my home computer as a webproxy.

Apache is only use from wampserver installation. MySQL, PHP are for my later development. CGI is implemented with active perl as in the following link.
http://www.webstuffscan.com/2006/12/21/accessing-blocked-websites-use-your-own-proxy-server-at-home/
The directory of wamp server is c:\wamp\www. Therefore, I reside the cgi script in that directory rather than htdocs.

It is ok to get through. But Opera Unite is not an SSL proxy. Therefore, I cannot get to facebook.

Google Sketch Up

After trying sketch up, i have to accept that this is the best tool i have ever used in sketching 3D. Except from the fact that a mouse is essential, ease of use weights way better than those CAD software I have been using in drawing. I will also try out other sketching tools such as BLENDER also since my experience is mostly on CAD rather than sketching.

Following video shows how efficient and easy to draw a chair with sketch up (We can also enter length, height, etc. precisely).

GAME 3D MODELS

3D models are available for purchase for the lazy and more of programmer rather than artists to develop 3D games. Models can be purchased here:
http://www.3drt.com/3dm/3dm.htm
http://www.game-stuff.com/
http://3dbud.com/
http://www.frogames.net/content-packs.html
http://dexsoft-games.com/
http://www.arteria3d.com/
http://www.garagegames.com/products/browse/artpacks
http://www.exchange3d.com/3D_catalog_game%20ready.html
http://www.turbosquid.com/
http://www.scifi-meshes.com/forums/
Google it. There are so many . . .

Saturday, June 11, 2011

Installing Andriod SDK

Andriod developers webpage (http://developer.android.com/sdk/index.html) recommended to download and install the executable installer "installer_r11-windows.exe". The installer always detect the JRE but cannot detect the JDK installed. This is because the installer searches "Java.exe" file at the "C:\Program Files\Java" folder alphabetically and sequentially. Usually JRE installed earlier reside in the folder and alphabetically, they are the first one to be found. Even with the path variable set to correct JDK directory, the installer always found the JRE only.

When I rename the "java.exe" the old JRE directory, the program can find the JDK installation and can continue with installation.

Or simply, download the "android-sdk_r11-windows.zip" file and unzip into a folder.

Thursday, February 17, 2011

Altium

It is quite troublesome to get a correct foot print in Altium. The footprint names follow IPC standard and not common name that is used in this industry. They do not have name like SOT23, DPAK. Matching the common name to standard names are a bit tiring when creating your own schematic library with footprints.

Friday, May 7, 2010

TESTING FAX THROUGH OUTLOOK

i am testing sending fax from multiple attachments of different kinds
such as word, excel and tif. It seems that Excel doesn't follow the
preset format in the file and send only the first sheet of the file.

Wednesday, March 31, 2010

Turning a robot with HMC6343


The HMC6343 is a solid-state compass module with tilt compensation from Honeywell. The HMC6343 has three axis of magneto, three axis of accelerometer, and a PIC core running all the calculations. What you get is a compass heading over I2C that stays the same even as you tilt the board. Solid-state (and many classic water based) compasses fail badly when not held flat. The tilt compensated HMC6343 is crucial for those real-world compass applications.

Features:
-I2C interface
-2 degree heading accuracy
-All in one compass solution
-Compass heading and tilt outputs
-Dimensions: 0.8x0.8"



1) Data Reading
Despite all the good things mentioned on the webpage, when i started to link to atmega128, I found out that I cannot read all 6 bytes in sequence as the example shown in the datasheet.

//Gets bulk data for the four different post types (Accel, Mag, Heading, Tilt)
void send_command(uns8 command_byte)
{
uns8 in_byte, i;

i2c_ack_polling(DEVICE_WRITE);

i2c_start();
i2c_send_byte(DEVICE_WRITE);
i2c_send_byte(command_byte); //Send command
//i2c_stop();

i2c_start(); //Repeat start (SR)
i2c_send_byte(DEVICE_READ); //Now ask the IC to report on the last command
for(i = 0 ; i < in_byte =" i2c_read_bytes();">


Rather, I have to send the device read address before reading each byte. The program becomes like the one below. It requires more time in reading heading data as the data transfer is in serial.

void getheading(){
uns8 hmc_data[6];
bled_on;
i2c_start();
i2c_write(0x32);
i2c_write(0x50);
delay_ms(2);
i2c_start();
i2c_write(0x33);
hmc_data[0] = i2c_read(0);
i2c_start();
i2c_write(0x33);
hmc_data[1] = i2c_read(0);
i2c_start();
i2c_write(0x33);
hmc_data[2] = i2c_read(0);
i2c_start();
i2c_write(0x33);
hmc_data[3] = i2c_read(0);
i2c_start();
i2c_write(0x33);
hmc_data[4] = i2c_read(0);
i2c_start();
i2c_write(0x33);
hmc_data[5] = i2c_read(0);
i2c_stop();
heading = hmc_data[0] <<>
tilt = hmc_data[2] <<>
roll = hmc_data[4] <<>
if ((tilt & 0xE000) != 0x0000){ tilt = ~tilt; }
if ((roll & 0xE000) != 0x0000){roll = ~roll; }
}


2) Data Reading During Movement

It is also found out that after turning the robot in a few milliseconds, the heading data from HMC6343 is not yet updated. About 400 ms is required to get a stable reading. The datasheet described that the normal updating rate is 5Hz. Therefore, it is very clear that if we read the heading and check at every step of the motor, the motor steps should take longer than 200 msec. Therefore, it is easier to calculate the number steps it should turn to make the require degree and read the heading from the compass after taking all steps. If there is any difference in the heading desired and heading read from the compass, another the number of steps need to compensate is calculated again. Then turn again until the error is smaller than designed allowance.

The calculation needs to be careful when it crosses the north. If the robot was turn 100 degree
from the heading 300 degrees to the right (east), it will pass 360 degrees and ends at 40 degrees
heading. If the robot is at heading 100 degrees and turn left (-200) degrees then the final heading read from the compass is 260 degrees.


So the total degrees turned should be calculated differently for right and left turns. Especially the amount of degrees turned should be calculated carefully if the compass turning cross the north.

The simplified turning code can be like the one below:


void turn(float deg){
signed char lr;
unsigned char noofturn,i;
int turnerror;
int oldhead;
if (deg > 360) deg = deg-360;
if (deg < -360) deg = deg+360;
turnerror = (int)(deg * 10);
noofturn = (int)abs(turnerror)/DEGPERTURN;
while (noofturn > 0){
lr = sign(turnerror);
noofturn = (int)abs(turnerror)/DEGPERTURN;
if (noofturn>0) {
getheading();
oldhead = heading;
for (i =0; i
if (lr>0)left();
else right();
}
delay_ms(400); // it is found out that 400msec delay
// can stabilized the heading reading
getheading();
if (lr>0) {
if (heading > oldhead) {
turnerror = turnerror - (3600 - heading + oldhead);
}
else {
turnerror = turnerror - (oldhead - heading);
}
} else {
if (oldhead > heading) {
turnerror = turnerror + (3600 - oldhead + heading);
}
else {
turnerror = turnerror + (heading - oldhead);
}
}
}
}
}

Monday, March 15, 2010

AC powered LED circuits



The current or LED can be varied. In my case, I assume 10mA.
For 240V AC, to get around 20mA -> 240 / 10m = 22K ohm and
power dissipated on resistor will be around 3W. The resistor will run
hot. Usually a capacitor can be used as a resistor.
Xc = 1 / (2 pi f C)
150 nF/400V will work fine. The capacitor should be non-polarised.

LED needs to be protected from large reverse voltages, another LED
in the reverse direction might just work. Or just put a diode like 1n4007.

Using capacitor only, will have a problem with inrush current during
switching and it is preferable to have a resistor in series with the capacitor.
A 1K or 2K resistor can be connected in series to the capacitor to reduce
the maximum inrush current to 240mA with 1K ohm or 120mA with 2K ohm.

After turning off, we also like to have the capacitor to discharge quickly. Any
voltage residue on the capacitor may cause shock. Therefore, a resistor is
usually installed in parallel with the capacitor. A 220k ohm resistor in parallel
with the capacitor should remove the 70% of residue in 35msec.

Some links to such a designs:

I downloaded their schematic and put here for access in case their site goes offline:




Followers