pouët.net

The random Copy&Paste thread

category: residue [glöplog]
http://www.nopantsday.com/wp/
tabyspelsallskap.ning.com
http://content4.novojoy.com/g16033/14.jpg
added on the 2009-05-02 00:10:41 by Muerto Muerto
AddHandler php5-script .whale
Sophie Wilson, formerly Roger Wilson, is a British computer scientist. She was educated at Cambridge University. In 1978, she designed the Acorn Micro-Computer, the first of a long line of computers sold by Acorn Computers Ltd

(oh hauerha! :))
added on the 2009-05-14 21:51:17 by xyz xyz
y once I believe..
So it's just Barkley and Cyberdwarf. Great. Gold Encrusted Pick Toss hits four
times, each dealing around 85 damage per hit. Gemstorm also hits all party
members for around 200 damage, so it's a real doozy. Level 2 Melf's Acid Arrow
not only inflicts around 220 damage, but also inflicts Diabetes on its
intended target. As you can see, Duergar is very powerful, and having only two
party members makes this battle a lot harder than normal bouts.
The random Copy&Paste thread
added on the 2009-05-16 11:32:09 by toxie toxie
The crappy random paste thread
added on the 2009-05-16 14:23:08 by Joghurt Joghurt
added on the 2009-05-14 21:51:17 by hermes
hermes
y once I believe..
added on the 2009-05-16 05:27:16 by bittin
bittin
So it's just Barkley and Cyberdwarf. Great. Gold Encrusted Pick Toss hits four
times, each dealing around 85 damage per hit. Gemstorm also hits all party
members for around 200 damage, so it's a real doozy. Level 2 Melf's Acid Arrow
not only inflicts around 220 damage, but also inflicts Diabetes on its
intended target. As you can see, Duergar is very powerful, and having only two
party members makes this battle a lot harder than normal bouts.
added on the 2009-05-16 06:13:27 by :confused:
:confused:
The random Copy&Paste thread
added on the 2009-05-16 11:32:09 by toxie
toxie
The crappy random paste thread
added on the 2009-05-16 14:23:08 by Joghurt
Joghurt

damn, saw naked pics of yours or maybe the one in pic is similar to you .... crazy lol
added on the 2009-05-20 16:29:20 by BarZoule BarZoule
Terrornews #21 written as always by Dj mElLoW nOiSe ... [c] 1995 INFECT TM

  • -------------------------------------------------------------------------------
    Today : Friedhuhn, die Ossischlampe !

    Nein! Das sah gar nicht gut aus... Friedhuhn watschelte ueber den Gehweg.
    Heute war der 31.12. und für's Anschaffen sah's gar nicht gut aus. 2 Kunden
    bis jetzt und einer steckte ihr auch noch nen China-Böller unten rein.
    Mit völlig zerfetzter Kiste rennt sie seit 20 Minuten hin und her. Kein Kunde
    in Sicht. Jetzt ist Schluß. Sie hielt einen Polizeiwagen an , haute dem Beamten
    den Kopf ab, packte ihn in das Handschuhfach und parkte CS-Gas dem Fahrer in's
    Maul. Ja... Das war der blanke Haß. Sie nahm die Pumpgun aus dem Kofferraum
    und hielt den nächsten Laster an. Nach Extremvergewaltigung an einem 78jährigen
    Lasterfahrer rauschte sie mit dem Teil in die nächste City. Yo! Nun waren sie
    dran die Rentner. Nix mit Konsummarken und mehr Kartoffeln im Heim. BLut,
    Köpfe und Eingeweide standen bei Friedhuhn ganz oben auf der Liste. Ups --
    Der Kinderwagen und die Frau... Passiert ! Alk in's Maul und ab geht der Gaul!
    Friedhuhn kam 53mal in der Minute. Nach 2 Pommesständen, 1 Waldhütte und 4
    MFS-Lebkuchen Fabriken fuhr sie sich über den Haufen, starb und erwürgte ...
    Eigentlich nicht schlecht für'ne 42 jährige angebombte Drecksau. Bissl mehr
    Action hätte sein können, aber es heißen ja nicht alle Gertrud und Rudolf-
    added on the 2009-05-20 19:03:23 by gentleman gentleman
    NSLog(@"Book initialised.\nTitle: %@\nDescription: %@\nDownloaded: %@", self.title, self.description, self.isDownloaded ? @"Yes" : @"No");

    (i'm wondering why the hell this is in one of my class files, the app is nothing to do with books :( )
    added on the 2009-05-20 19:10:49 by psonice psonice
    11111100000000000000011111100000000000000011111110000000000000011111111110000000111111111
    11110000000000000000011111000000000000000011111000000000000000011111111100000000011111111
    11100000000000000000011110000000000000000011110000000000000000011111111000011000011111111
    11000001111111111111111100001111111111111111100000111111111111111111111000111110001111111
    10000111111111111111111000011111111111111111100001111111111111111111111000100110001111111
    10000110000000000000011000110000000000000011100011000000000000011111110000100010000111111
    10001100000000000000011000110000000000000011000010000000000000011111110001100011000111111
    10001000000000000000011000110000000000000011000110000000000000011111110001000011000111111
    10001000011111111111111000110001111111111111000110001111111111111111100001000001000011111
    10001000000000011111111000110000000000001111000110001100000000011111100011000001100011111
    10001100000000000111111000110000000000001111000110001100000000011111000011000001100011111
    10000110000000000011111000110000000000001111000110001100000000011111000010000000100001111
    11000011111111000001111000111111111111111111000110001111111100011111000110000000110001111
    11000001111111100000111000111111111111111111000110001111111100011111000110000000110001111
    11100000000000110000111000110000000000001111000110001100000100011110000100001000010000111
    11110000000000011000111000110000000000001111000110001100000100011110001100011000011000111
    11111110000000001000011000110000000000001111000110001100000100011110001000011100011000111
    11111111111110001000011000110001111111111111000110001111000100011100011000011100001000011
    10000000000000001000011000110000000000000011000110000000000100011100011000000000001100011
    10000000000000011000111000110000000000000011000010000000000100011100010000100000001100011
    00000000000000010000111000110000000000000011100011000000000100011000110000100000000100001
    11111111111111110000111000011111111111111111100001111111111100011000110001111111111110001
    11111111111111000001111100001111111111111111100000111111111100010000100001111111111110001
    00000000000000000011111110000000000000000011110000000000000000000001100001100000000000000
    10000000000000000111111111000000000000000011111000000000000000000001100001100000000000000
    10000000000000011111111111100000000000000011111110000000000000000001000011100000000000000
    added on the 2009-05-26 20:17:01 by gentleman gentleman
    0001110011111111100000000000000000111111111111111111111111111111111110000
    0001100001111111100000000000000000111111111111111111111111111111111111000
    0001100001111111000000000011110000111111111111111111111111111111111111100
    0001100000111111000000000011111000011111111111111111111111111111111111110
    0001100000111110000000000011111000011111111111111111111111111111110000000
    0001100000111110000000000011111100011111111111111111111111100000000000000
    0001100001111110000000000011111100111111111111111111111100000000000000000
    0000100001111000000000000001111100111111111111111111110000000000000000000
    0000110000111100000000000001111100111111111111111111111111000000000000000
    0000011000111100000000000000010001111111111111111111111111111000000000000
    0000001000011100000000000000000011111111111111111111111111111110000000000
    0000011111111000000000000000001111111111111111111111111111111111000000000
    0001111111111111111111000011111111111111111111111111111111111111100000000
    0000111111111111111111111111111111111111111111111111111111111111110000000
    0000000000111111111111111111111111111111111111111111111111111111111000000
    added on the 2009-05-27 14:17:36 by gentleman gentleman
    Code: forceware 183, private working set: - 4072k: end: new engine - 3908k: begin: load plugin tkfreetype2 - 3988k: end: load plugin tkfreetype2 +80k - 3988k: begin: load plugin tkopengl - 4788k: end: load plugin tkopengl +800k - 5032k: begin: load plugin tksdl - 7072k: end: load plugin tksdl +2040k - 7332k: begin: load plugin tkui - 7392k: end: load plugin tkui +60k - 9644k end: scanner pass +5572k (since end: new engine) (2980k plugins, 2592k other) - 16284k end: compiler pass +6640k - 20936k <before openwindow> (total = 20936-4072 = 16864k for scan/compile/init pass) -> 16864/98 = ~172kbyte per class (and module) (inc. everything) (vs. 18-Apr-2009 tks-preview.exe: -5MBytes, ~30% saved) - 29268k <after openwindow> +8332k - 29476k <ui running> (stable) +208k [...] compile time: 266 millisec (2.71 millisec avg. per module). [...] optimize time: 16 millisec (0.16 millisec avg. per module). [...] #modules = 98 [...] #tokens = 134285 [...] #strings = 10011 [...] #tokenchars = 192071 [...] #lines = 28379 [...] #unopt.nodes = 50851 [...] #opt.nodes = 50076 [...] #nodebytes = 2670816 / 2883584 [...] #classes = 125 [...] #methods = 1330 [...] #members = 505 [...] #t.methods = 11760 / 24144 [...] #t.members = 2920 / 5442 [...] #classinits = 200 [dbg] Object::object_counter = 44212 [dbg] (38950 strings) [dbg] (3474 script class instances) [dbg] CachedObject::object_counter = 19584 [dbg] CachedObject::refname_counter = 16311 [dbg] CachedObject::copyname_counter = 3118 [dbg] ObjectPool: current pool usage: [dbg] priority class 0: 8 / 2048 kbytes allocated, 0 kbytes used. [dbg] priority class 1: 136 / 2048 kbytes allocated, 48 kbytes used. [dbg] priority class 2: 492 / 2048 kbytes allocated, 398 kbytes used. 
[dbg] peak_char_size = 345344 [dbg] total_char_size = 240178 -> ~4.73 tokens per LOC -> ~0.35 strings per LOC -> ~1.7645 nodes per LOC -> ~21 lines per method -> ~608 bytes memory per LOC -> ~12614 total bytes memory per method -> ~2008 node bytes per method (2670816/1330) -> ~390 bytes memory per object -> ~4970 bytes memory per script class instance

    (the -> lines are just estimates but interesting anyway)
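The "->" estimates in the log can be re-derived from the raw counters. The following is only a sketch: the constant names are made up for illustration, and it assumes that "k" in the working-set figures means 1024 bytes.

```python
# Figures copied from the log above; the names are illustrative only.
WORKING_SET_START_KB = 4072   # "end: new engine"
WORKING_SET_END_KB = 20936    # "<before openwindow>"
MODULES = 98
TOKENS = 134285
STRINGS = 10011
LINES = 28379
OPT_NODES = 50076
NODE_BYTES = 2670816
METHODS = 1330


def derived_stats():
    """Recompute some of the per-LOC and per-method estimates."""
    used_kb = WORKING_SET_END_KB - WORKING_SET_START_KB
    return {
        "used_kb": used_kb,
        "kb_per_module": used_kb / MODULES,
        "tokens_per_loc": TOKENS / LINES,
        "strings_per_loc": STRINGS / LINES,
        "nodes_per_loc": OPT_NODES / LINES,
        "lines_per_method": LINES / METHODS,
        "node_bytes_per_method": NODE_BYTES / METHODS,
        # Assumption: 1k = 1024 bytes.
        "bytes_per_loc": used_kb * 1024 / LINES,
    }


if __name__ == "__main__":
    for name, value in derived_stats().items():
        print(f"{name}: {value:.4f}")
```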
    (kudos to nvidia for fixing the forceware 175 bug regarding fluctuating and excessive memory use with gf8500gt on vista32 (the working set is now ~38mb smaller, ~29 instead of ~67 mbytes)

    JSYK.
    added on the 2009-06-03 00:48:21 by xyz xyz
    (and yea, it's a debug/test-run for a UI toolkit written in a VHLL)
    added on the 2009-06-03 00:59:23 by xyz xyz
    (correction: forceware 185.85 instead of 183)
    added on the 2009-06-03 01:12:54 by xyz xyz
    BB Image

    and it looks like this ;)
    added on the 2009-06-03 01:50:37 by xyz xyz
    Marilyn Manson has admitted he cried when his tour manager refused to supply him with drugs recently. The shock-rocker revealed he was left upset by the move, saying: "About two hours ago Steve [Manson's tour manager] wouldn't get me any drugs. My eyes watered. Then he asks me if I am OK! It's ridiculous. It should be, 'Here's the mirror - and now are you OK?'"
    added on the 2009-06-09 19:21:25 by orb orb

    0

    Langeweile? Zur Bewußtseinserweiterung. Informationsgewinnung.

    1

    Der eine mit dem Pferdeschwanz schnitt ein paar Hanfblüten klein und vermischte sie mit Tabak. Dann stopfte der andere einen kleinen Metallbecher und zog aus einer umgebauten Cola-Flasche zwei Gelbe. Die Luft war dick und haschischgeschwängert. Alle Wege führen nach Bagdad. Alter. Er wollte eigentlich nur kurz vorbeischauen um etwas zu fragen. Der Eimer teilte eine Runde langsamer aber tiefer Schläge aus.

    2

    Der eine saß auf dem Sofa, weder in Erwartung was war, noch was kommen würde. Der mit den Kotletten grinste plötzlich breit. Er stand auf und holte eine Videokassette. Sie zogen sich "Speed" rein. Danach zogen sie sich Speed rein. Auf der "Speed"-Kassette. Der andere schaltete irgendwann, keiner bemerkte es, von MTV auf einen Pornofilm um. Und die Pizza kam nicht.

    3

    Der mit der Brille schaute immer nur zu. Ihm war das ganze neu, schien es den anderen. Und er sagte das auch so. Wenn sie wüßten. Manche Dinge nimmt man eben lieber als Scherz, oder Übertreibung.

    4

    Er erkannte sie nur undeutlich durch den Lärm und den Nebel. Aber er hatte Recht gehabt. Etwas anderes konnte er sich auch nicht leisten. Der Entzug zehrte an seinen Kräften. Und trieb ihn hinaus, ins Freie. Doch er riß sich zusammen, schaffte eine Annäherung und fand heraus was er wissen wollte. Sie hatte drei Pillen eingeworfen, eine davon LSD-beschichtet. Und sie wollte noch mehr. Er hatte mehr. Sagte er.

    Er schob sie vor sich her durch den Durchgang zum Klo. Nach Links, zum Männerklo. Er schob sie in eine Kabine und schloß ab. Die Stille ließ alles unwirklich wirken. Sie war bereits mißtrauisch geworden. Er setzte sie grob auf die Schüssel. Blitzschnell zog er einen Plastikstreifen aus der Tasche und fesselte ihre Hände zusammen. Mit einem Klebeband verschloß er ihre Lippen.
    Sie sagte mmh mmh.
    Er sagte pssst, kein Problem, wir regeln das mit der Bezahlung heute nur ein bißchen anders. Sein Phallus sprang aus seiner Hose. Er zog die Vorhaut zurück und klebte eine kleine weiße Pille auf die schwitzende Eichel. Dann schob er seinen Schwanz unter ihr enges, bunt gestreiftes Hemdchen. Er schob ihn bis oben zwischen ihren Brüsten durch gegen ihr Kinn und drückte es hoch, so daß sie ihn durch ihre Brille entsetzt anstarren mußte. Er zog seinen Schwanz wieder zurück und stemmte ihn gegen das Klebeband vor ihren weichen Lippen. Mit einer Hand zog er es ab, mit der anderen blockierte er ihre Kiefer. Sein Phallus fuhr bis tief in ihre Kehle. Sie schnaufte und heulte durch die Nase, Rotz und Wasser liefen auf sein Schamhaar. Aber gierig leckte sie seine Eichel, bis ihre Zunge die kleine weiße Pille gefunden hatte. Er ergoß sich in ihren Mund und sie spülte die Tablette mit seinem Sperma
    herunter.
    Es war nicht so eine wie sie dachte, und sie fiel alsbald in tiefen Schlaf. Er streichelte ihr Haar, packte seinen Schwanz wieder ein und verschwand in die Nacht. Das Zittern hatte nachgelassen, aber nicht aufgehört.

    5

    Er war wieder auf dem Weg, auf der Suche. Schweißperlen standen auf seiner Stirn. Er bebte. Sein Schwanz zitterte. Er wartete auf den Aufzug, stieg ein und fuhr in den 15. Stock. Auf dem Weg kam ihm die rettende Idee. Dies war ein Studentenwohnheim. Er blieb im Aufzug stehen. Er band sich eine Schutzmaske aus Papier um.
    Nach einer kurzen Zeit stieg sie ein. Wollte sie zum Cafe auf dem Dach. Zum Ausgang unten. Zwischen zwei Stockwerken hielt er den Aufzug an und konsumierte sie.

    6

    Es war schon fast hell, als er aus der letzten Diskothek kam, die Knochen bebten vom Entzug. Auf der gegenüberliegenden Straßenseite waren zwei Mädchen. Die eine lag auf dem Rücken auf dem Asphalt. Die andere stand daneben. Gibt es ein Problem fragte er und sein Herz schlug bereits höher. Er schaffte es, die beiden in seine Höhle zu bugsieren. Stunden später lag die eine bewußtlos in einer verrenkten Seitenlage nackt in der Ecke. Ihre blonden kurzen Haare war von ihrem eigenen Erbrochenen verklebt. Die andere hatte ihren Kopf auf seinem Schoß. Sie war verpackt mit Plastiklaschen und Paketklebeband, sonst trug sie nichts. Sie zitterte stark, doch ihre vor Schreck geweiteten Augen waren stumpf geworden. Er saß vollkommen zufrieden und beruhigt auf der Matratze und streichelte das Bündel auf seinem Schoß. Nachdem er sich Gedanken über die Entsorgung gemacht hatte, zückte er sein kleines Büchlein und trug die Dosis ein.

    7

    Er saß in seinem anderen Zimmer auf dem Boden vor dem Computer. Das einzige Licht kam aus dem Monitor. Er codete. Er konnte es, während die Dosis noch wirkte.




    added on the 2009-06-09 19:41:38 by torus torus
    ==Phrack Inc.==

    Volume 0x0d, Issue 0x42, Phile #0x0D of 0x11

    |=----------------------------------------------------------------------=|
    |=---------=[ Hacking the Cell Broadband Engine Architecture ]=---------=|
    |=-------------------=[ SPE software exploitation ]=--------------------=|
    |=----------------------------------------------------------------------=|
    |=--------------=[ By BSDaemon ]=----------=|
    |=--------------=[ <bsdaemon *noSPAM* risesecurity_org> ]=----------=|
    |=----------------------------------------------------------------------=|

    "There are two ways of
    constructing a software design.
    One way is to make it so simple
    that there are obviously no
    deficiencies. And the other way
    is to make it so complicated that
    there are no obvious deficiencies"
    - C.A.R. Hoare

    ------[ Index

    1 - Introduction
    1.1 - Paper structure

    2 - Cell Broadband Engine Architecture
    2.1 - What is Cell
    2.2 - Cell History
    2.2.1 - Problems it solves
    2.2.2 - Basic Design Concept
    2.2.3 - Architecture Components
    2.2.4 - Processor Components
    2.3 - Debugging Cell
    2.3.1 - Linux on Cell
    2.3.2 - Extensions to Linux
    2.3.2.1 - User-mode
    2.3.2.2 - Kernel-mode
    2.3.3 - Debugging the SPE
    2.4 - Software Development for Linux on Cell
    2.4.1 - PPE/SPE hello world
    2.4.2 - Standard Library Calls from SPE
    2.4.3 - Communication Mechanisms
    2.4.4 - Memory Flow Control (MFC) Commands
    2.4.5 - Direct Memory Access (DMA) Commands
    2.4.5.1 - Get/Put Commands
    2.4.5.2 - Resources
    2.4.5.3 - SPE 2 SPE Communication

    3 - Exploiting Software Vulnerabilities on Cell SPE
    3.1 - Memory Overflows
    3.1.1 - SPE memory layout
    3.1.2 - SPE assembly basics
    3.1.2.1 - Registers
    3.1.2.2 - Local Storage Addressing Mode
    3.1.2.3 - External Devices
    3.1.2.4 - Instruction Set
    3.1.3 - Exploiting software vulnerabilities in SPE
    3.1.3.1 - Avoiding Null Bytes
    3.1.4 - Finding software vulnerabilities on SPE

    4 - Future and other uses

    5 - Acknowledgements

    6 - References

    7 - Notes on SDK/Simulator Environment

    8 - Sources


    ------[ 1 - Introduction

    This article is all about the Cell Broadband Engine Architecture [1], new
    hardware designed by a joint effort between Sony [2], Toshiba [3] and
    IBM [4].

    As such, lots of architecture details will be explained, along with many
    development differences for this platform.

    The biggest difference between this article and others released on the
    subject is the focus on exploiting the architecture itself, not on using
    the powerful processor resources to break code [5]. The focus is also on
    what sets the architecture apart, which means the SPU (synergistic
    processor unit) and not the core (PPU - power processor unit) [6], since
    the core is a slightly modified Power processor (which means all
    shellcodes for Linux on Power will also work for the core, with just
    small differences in code allocation and things like that).

    It's important to mention that everything about Cell here tries to focus
    on the Playstation3 hardware, since it's cheap and widely deployed, but
    there are also big machines made with this processor [7], including the
    #1 in the list of supercomputers [8].



    ---[ 1.1 - Paper structure

    The idea of this paper is to complete the studies about Cell, putting
    together all the information needed to do security research focused on
    software exploitation for this architecture.

    For that, the paper has been structured in two important portions:

    Chapter 2 is all about the Cell Architecture and how to develop for
    this architecture. It includes many samples and explains the
    modifications done to Linux in order to get the best from this
    architecture. It also gives the knowledge needed to go further in
    software exploitation for this arch. Chapter 3 is focused on the
    exploitation of the SPU processor, showing the simple memory layout it has
    and how to write a shellcode for the purpose of gaining control over an
    application running inside the SPU.


    ------[ 2 - Cell Broadband Engine Architecture

    From the IBM Research [9]: "The Cell Architecture grew from a challenge
    posed by Sony and Toshiba to provide power-efficient and cost-effective
    high-performance processing for a wide range of applications, including
    the most demanding consumer appliance: game consoles. Cell - also known as
    the Cell Broadband Engine Architecture (CBEA) - is an innovative solution
    whose design was based on the analysis of a broad range of workloads in
    areas such as cryptography, graphics transform and lighting, physics,
    fast-Fourier transforms (FFT), matrix operations, and scientific
    workloads. As an example of innovation that ensures the clients' success,
    a team from IBM Research joined forces with teams from IBM Systems
    Technology Group, Sony and Toshiba, to lead the development of a novel
    architecture that represents a breakthrough in performance for consumer
    applications. IBM Research participated throughout the entire development
    of the architecture, its implementation and its software enablement,
    ensuring the timely and efficient application of novel ideas and
    technology into a product that solves real challenges."

    It's impossible not to get excited about this. Such a 'powerful' and
    versatile architecture, completely different from what we usually see, is
    an amazing thing to research for software vulnerabilities. Also, since
    it's supposed to be widely deployed, there will be a large number of
    new vulnerabilities appearing in the near future. I wanted to exploit
    those vulnerabilities.


    ---[ 2.1 - What is Cell

    As must already be clear to the reader, I'm not talking about phones here.
    Cell is a new architecture, which came to solve some of the current
    problems in the computer industry.

    It's compatible with a well-known architecture, the Power Architecture,
    keeping most of its advantages and solving most of its problems (if you
    cannot wait to know which problems, go to section 2.2.1).


    ---[ 2.2 - Cell History

    The focus of this section is just to give the reader a timeline, without
    going into detail.

    The architecture was born from a joint effort between IBM, Sony and
    Toshiba, formed in 2000.

    They opened a design center in March 2001, based in Austin, Texas (USA).

    In the spring of 2004, a single Cell BE became operational. In the summer
    of the same year, a 2-way SMP version was released.

    The first technical disclosures came only in February 2005, with the
    simulator [10] and open-source SDK [11] (more on that later) being
    released in November of the same year. In the same month, Mercury started
    to sell Cell (yeah, sell Cell sounds funny) machines.

    Cell Blades were announced by IBM in February of 2006. The SDK 1.1 was
    released in July of the same year, with many improvements. The latest
    version is 3.1.


    ---[ 2.2.1 - Problems it solves

    Computer technology has been evolving over the years, but always
    suffering from, and trying to avoid, some barriers.

    Those barriers are physical limits that cannot be bypassed, and that's why
    processor clocks stopped growing and multi-core architectures came into
    focus.

    Basically we have three big walls (barriers) to speed growth:
    - Power wall
    It's related to the CMOS technology limits and the hard limit to
    the acceptable system power

    - Memory wall
    Many comparisons and improvements trying to avoid the DRAM latency
    when compared to the processor frequency

    - Frequency wall
    Diminishing return from deeper pipelines

    For a new architecture to work and be widely deployed, it was also
    important to preserve existing investments in software development.

    Cell accomplishes that by being compatible with the 64-bit Power
    Architecture, and attacks the walls in the following ways:

    - Non-homogeneous coherent multi-processor and high design
    frequency at a low operating voltage with advanced power
    management attacks the 'power wall'.
    - Streaming DMA architecture and three-level memory model (main
    storage, local storage and register files) attacks the 'memory
    wall'.
    - Non-homogeneous coherent multi-processor, highly-optimized
    implementation and large shared register files with software controlled
    branching to allow deeper pipelines attacks the 'frequency wall'.

    It has been developed to support any OS, which means it supports
    real-time operating systems as well as non-real-time operating systems.

    ---[ 2.2.2 - Basic Design Concept

    The basic concept behind Cell is its asymmetric multi-core design. That
    permits a powerful design, but of course requires specifically developed
    applications to get the most out of the architecture.

    Knowing that, it becomes clear that understanding the new component,
    which is called SPU (synergistic processor unit) or SPE (synergistic
    processor element), proves to be essential - see the next section for a
    better understanding of the differences between SPU and SPE.


    ---[ 2.2.3 - Architecture Components

    In Cell what we have is a core processor, called the Power Processor
    Element (PPE), which controls tasks, and synergistic processor elements
    (SPEs) for data-intensive processing.

    The SPE consists of the synergistic processor unit (SPU), which is a
    processor itself, and the memory flow control (MFC), responsible for
    data movement and synchronization, as well as for the interface with the
    high-performance element interconnect bus (EIB).

    Communication with the EIB is done at 16B/cycle, which means that each
    SPU is connected at that speed to the bus, which supports 96B/cycle.

    Refer to the picture architecture-components.jpg in the directory images
    of the attached file for a visual of the above explanation.

    ---[ 2.2.4 - Processor Components

    As said, the Power Processor Element (PPE) is the core processor which
    controls tasks (scheduling). It is a general purpose 64-bit RISC processor
    (Power architecture).

    It's 2-way hardware multithreaded, with L1: 32KB I and D caches and L2:
    512KB cache.

    It has support for real-time operations, like locking the L2 cache and the
    TLB (the TLB can be managed by hardware and by software). It also has
    bandwidth and resource reservation and mediated interrupts.

    It's also connected to the EIB using a 16B/cycle channel (figure
    processor-components.jpg).

    The EIB itself supports four 16 bytes data rings with simultaneous
    transfers per ring (it will be clarified later).

    This bus supports over 100 simultaneous transactions achieving in each bus
    data port more than 25.6 Gbytes/sec in each direction.
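The per-port figure follows from bytes per cycle multiplied by the bus clock. A minimal sketch of that arithmetic is shown here. The 1.6 GHz bus clock is an assumption (half of a 3.2 GHz core clock) and is not stated in the text above; the aggregate number is simply what 96B/cycle would give under that same assumed clock.

```python
# Assumption: the EIB runs at half of a 3.2 GHz core clock.
ASSUMED_EIB_CLOCK_GHZ = 1.6

PORT_BYTES_PER_CYCLE = 16   # per-port width quoted in the text
BUS_BYTES_PER_CYCLE = 96    # whole-bus figure quoted in the text


def bandwidth_gbytes_per_sec(bytes_per_cycle, clock_ghz):
    """Bytes per cycle times cycles per nanosecond gives GB/s."""
    return bytes_per_cycle * clock_ghz


if __name__ == "__main__":
    port = bandwidth_gbytes_per_sec(PORT_BYTES_PER_CYCLE, ASSUMED_EIB_CLOCK_GHZ)
    bus = bandwidth_gbytes_per_sec(BUS_BYTES_PER_CYCLE, ASSUMED_EIB_CLOCK_GHZ)
    print(f"per port, per direction: {port:.1f} GB/s")
    print(f"whole bus (under the same assumption): {bus:.1f} GB/s")
```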

    On the other side, the synergistic processor element is a simple RISC
    user-mode architecture supporting dual-issue VMX-like, graphics SP-float
    and IEEE DP-float.

    Important to note that the SPE itself has dedicated resources: unified 128
    x 128 bit register files and 256KB local storage. Each SPE has a
    dedicated DMA engine, supporting 16 requests.

    Memory management on this architecture is simple to use, with the
    local storage of the SPE being aliased into the PPE system memory (figure
    processor-components2.jpg).

    The MFC in the SPE acts as the MMU, providing control over SPE DMA
    accesses. It's compatible with the PowerPC Virtual Memory layout and is
    software controllable using PPE MMIO.

    DMA access supports 1,2,4,8...n*16 bytes transfer, with a maximum of 16 KB
    for I/O, and with two different queues for DMA commands: Proxy & SPU
    (more on this later).
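The size rule above (1, 2, 4, 8 or a multiple of 16 bytes, up to 16 KB) can be expressed as a small check. This is only a sketch of the size rule as stated in the text: the function names are hypothetical, and it ignores any alignment requirements the hardware may add.

```python
MAX_DMA_BYTES = 16 * 1024  # 16 KB maximum per transfer, as stated above


def is_valid_dma_size(size):
    """True if size is 1, 2, 4, 8 or a multiple of 16 up to 16 KB."""
    if size in (1, 2, 4, 8):
        return True
    return 16 <= size <= MAX_DMA_BYTES and size % 16 == 0


def split_transfer(total_bytes):
    """Split a larger transfer (multiple of 16 bytes) into valid chunks."""
    if total_bytes <= 0 or total_bytes % 16 != 0:
        raise ValueError("total size must be a positive multiple of 16")
    chunks = []
    remaining = total_bytes
    while remaining > 0:
        chunk = min(remaining, MAX_DMA_BYTES)
        chunks.append(chunk)
        remaining -= chunk
    return chunks


if __name__ == "__main__":
    print(is_valid_dma_size(8), is_valid_dma_size(24))
    print(split_transfer(40000))
```

Anything bigger than 16 KB has to be issued as several commands, which is what split_transfer() illustrates.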

    The EIB is also connected to a broadband interface controller (BIC). The
    purpose of this controller is to provide external connectivity for
    devices. It supports two configurable interfaces (60 GB/s) with a
    configurable number of bytes, coherent (BIF) and/or I/O (IOIFx) protocols,
    using two virtual channels per interface, and multiple system
    configurations.

    The memory interface controller (MIC) is also connected to the EIB and is
    a Dual XDR controller (25.6 GB/s) with ECC and suspended DRAM support
    (figure processor-components3.jpg).

    Two more components are still missing: the internal interrupt controller
    (IIC) and the I/O Bus Master Translation (IOT) (figure
    processor-components4.jpg).

    The IIC handles the SPE interrupts as well as the external interrupts and
    interrupts coming from the coherent interconnect and the IOIF0 and IOIF1.
    It is also responsible for the interrupt priority level control and for
    the interrupt generation ports for IPI. Note that the IIC is duplicated
    for each PPE hardware thread.

    IOT translates bus addresses to system real addresses, supporting two
    level translations:
    - I/O segments (256 MB)
    - I/O pages (4K, 64K, 1M, 16M bytes)
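To picture the two levels, an I/O bus address can be split into a segment number, a page number inside that segment and an offset inside the page. The sketch below is purely illustrative: it assumes a plain power-of-two split, and it does not model the real IOST/IOPT table entries or their lookup.

```python
SEGMENT_SIZE = 256 * 1024 * 1024  # 256 MB I/O segments

# Page sizes listed in the text.
PAGE_SIZES = {
    "4K": 4 * 1024,
    "64K": 64 * 1024,
    "1M": 1024 * 1024,
    "16M": 16 * 1024 * 1024,
}


def split_io_address(bus_address, page_size_name):
    """Return (segment, page within segment, offset within page)."""
    page_size = PAGE_SIZES[page_size_name]
    segment = bus_address // SEGMENT_SIZE
    within_segment = bus_address % SEGMENT_SIZE
    return segment, within_segment // page_size, within_segment % page_size


if __name__ == "__main__":
    for name in PAGE_SIZES:
        seg, page, off = split_io_address(0x12345678, name)
        print(f"{name:>3}: segment={seg} page={page:#x} offset={off:#x}")
```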

    An interesting feature is the I/O device identifier per page for LPAR use
    (blades), and the IOST/IOPT caches managed by software and hardware.


    ---[ 2.3 - Debugging Cell

    As the bus is a high-speed circuit, it's really difficult to debug the
    architecture and get a good view of what is going on.

    For that, and also to make it easy to develop software for Cell, IBM
    Research developed a Cell simulator [10] in which you may run Linux and
    install the software development kit [11].

    The IBM Linux Technology Center Brazilian team developed a plugin for
    eclipse as an IDE for the debugger and SDK. Putting it all together, it is
    possible to have the toolkit installed on a Linux machine, running the
    frontends for the simulator and for the SDK. The debugging interface is
    much better using these frontends. Anyway, it's important to notice that
    they are just frontends for the normal and well-known Linux tools with
    extended support for the Cell processor (GDB and GCC).

    ---[ 2.3.1 - Linux on Cell

    Linux on cell is an open-source git branch and is provided in the PowerPC
    64 kernel line.

    It started in 2.6.15 and is evolving to support many new features,
    like the scheduling improvements for the SPUs (they can now be
    preempted, and my big friend Andre Detsch, who reviewed this article, was
    one of the biggest contributors to creating stable code here).

    It added a heterogeneous lwp/thread model to Linux, with a new SPE thread
    model (really similar to the pthreads library, as we will see later)
    supporting user-mode direct and indirect SPE access and fully preemptive
    SPE context management. For that, spe_ptrace() was created and support for
    it added to GDB, along with spe_schedule() for thread to physical SPE
    assignment (it is no longer FIFO, run until completion).

    As a note, SPE threads share their address space with the parent PPE
    process (using DMA), with demand paging for SPE accesses and a hardware
    page table shared with the PPE.

    An implementation detail is the PPE proxy thread allocated for each SPE to
    provide a single namespace for both PPE and SPE and assist in SPE
    initiated C99 and Posix library services.

    All the events, error and signal handling for SPEs are done by the parent
    PPE thread.

    The ELF objects for SPE are wrapped into PPE objects with an extended GLD.


    ---[ 2.3.2 - Extensions to Linux

    Here I'll try to provide some details about Linux running on Cell
    hardware. The base hardware used for this reference is a Playstation 3,
    which has 8 SPUs, but one is reserved for redundancy and another one is
    used by the hypervisor for a custom OS (in this case, Linux).

    All the details are valid for any Linux on Cell and we will take a
    top-down approach.

    ---[ 2.3.2.1 - User-mode

    Cell supports both 32- and 64-bit Power applications, as well as 32- and
    64-bit Cell workloads. It has different programming modes, like RPC,
    device subsystems and direct/indirect access.

    As already said, it has heterogeneous threads: single SPU, SPU groups and
    shared memory support.

    It runs on top of an SPE management runtime library, available in 32- and
    64-bit versions. This library interacts with the SPUFS filesystem
    (/spu/thread#/) in the following ways:
    * Open, close, read, write the files:
    - mem
    This file provides access to the local storage

    - regs
    Access to the 128 registers of 128 bits each

    - mbox
    SPE to PPE mailbox

    - ibox
    SPE to PPE interrupt mailbox

    - xbox_stat
    Get the mailbox status (one file per mailbox, x being the mailbox letter)

    - signal1
    Signal notification access

    - signal2
    Signal notification access

    - signalx_type
    Signal type

    - npc
    Read/write SPE next program counter (for debugging)

    - fpcr
    SPE floating point control/status register

    - decr
    SPE decrementer

    - decr_status
    SPE decrementer status

    - spu_tag_mask
    Access tag query mask

    - event_mask
    Access spe event mask

    - srr0
    Access spe state restore register 0


    * Open, close, mmap the files:
    - mem
    Program State access of the Local Storage

    - signal1
    Direct application access to signal 1

    - signal2
    Direct application access to signal 2

    - cntl
    Direct application access to SPE controls, DMA queues and
    mailboxes

    The library also provides SPE task control system calls (to interact with
    the SPE system calls implemented in kernel-mode), which are:
    - sys_spu_create_thread
    Allocates a SPE task/context and creates a directory in SPUFS

    - sys_spu_run
    Activates a SPU task/context on a physical SPE and
    blocks in the kernel as a proxy thread to handle the events
    already mentioned

    Some functions provided by the library are related to the management of
    the spe tasks, like spe create group, create thread, get/set affinity,
    get/set context, get event, get group, get ls, get ps area, get threads,
    get/set priority, get policy, set group defaults, group max, kill/wait,
    open/close image, write signal, read in_mbox, write out_mbox, read mbox
    status.


    Besides the standard 32- and 64-bit PowerPC ELF (binary) interpreters,
    an SPE object loader is provided. It is responsible for understanding the
    extension to the normal objects already mentioned and for initiating the
    loading of the SPE threads.

    Going down, we have the glibc and other GNU libraries, both supporting 32
    and 64 bits.


    ---[ 2.3.2.2 - Kernel-mode

    The next layer is the normal system-call interface, where we have the SPU
    management framework (through special files in the spufs) and
    modifications in the exec* interface, in a 64-bit kernel.

    This modification is done through a special misc binary format, called
    the SPU object loader extension.

    Of course there are other kernel extensions, such as the SPUFS
    filesystem, which provides the management interface and the SPU
    allocation, scheduling and dispatch.

    Also, we do have the Cell BE architecture specific code, supporting multi
    and large pages, SPE event & fault handling, IIC and IOMMU.

    Everything is controlled by a hypervisor, since Linux is what is called a
    custom OS when running on Playstation 3 hardware. The hypervisor is
    responsible for protecting the 'secret key' of the hardware, so knowing
    how to exploit SPU vulnerabilities, plus some fuzzing of the hypervisor,
    may be the knowledge needed to break the game copy protection on this
    hardware.


    ---[ 2.3.3 - Debugging the SPE

    The SDK for Linux on Cell provides good resources for debugging and for
    better understanding what is going on.

    It's important to note the environment variables that control the
    behaviour of the system.

    So, if you set SPU_INFO, for example, the SPE runtime library will
    print messages when loading an SPE ELF executable (see below).

    ---------- begin output ----------
    # export SPU_INFO=1
    # ./test
    Loading SPE program: ./test
    SPU LS Entry Addr : XXX
    ---------- end output ----------

    And it will also print messages before starting up a new SPE thread, like:

    ---------- begin output ----------
    Starting SPE thread 0x..., to attach debugger use: spu-gdb -p XXX
    ---------- end output ----------

    When planning to use spu-gdb to debug an SPU thread, it's important to
    remember the SPU_DEBUG_START environment variable, which includes
    everything provided by SPU_INFO and also stops the thread until a
    debugger is attached or a signal is received.

    Since each SPU register can hold multiple fixed (or floating) point values
    of different sizes, GDB provides a data structure that can be accessed
    with different formats. So, by specifying the field in the data
    structure, we can update it using different sizes as well:

    ---------- begin output ----------
    (gdb) ptype $r70
    type = union __gdb_builtin_type_vec128 {
    int128_t uint128;
    float v4_float[4];
    int32_t v4_int32[4];
    int16_t v8_int16[8];
    int8_t v16_int8[16];
    }

    (gdb) p $r70.uint128
    $1 = 0x00018ff000018ff000018ff000018ff0
    (gdb) set $r70.v4_int32[2]=0xdeadbeef
    (gdb) p $r70.uint128
    $2 = 0x00018ff000018ff0deadbeef00018ff0
    ---------- end output ----------
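
    The session above can be reproduced off-target with a small model. This
    is only a sketch under my own assumptions: the vec128 type and the
    vec128_* helpers are hypothetical names (they are not part of GDB or of
    the SDK), and the register is stored as 16 big-endian bytes so the result
    does not depend on the host byte order.

```c
#include <stdint.h>

/* Hypothetical model of one 128-bit SPU register. */
typedef struct {
    uint8_t b[16];   /* b[0] is the leftmost (most significant) byte */
} vec128;

/* Store a 32-bit word into slot 0..3, big-endian, like v4_int32[slot]. */
void vec128_set_u32(vec128 *r, int slot, uint32_t w)
{
    uint8_t *p = &r->b[4 * slot];
    p[0] = (uint8_t)(w >> 24);
    p[1] = (uint8_t)(w >> 16);
    p[2] = (uint8_t)(w >> 8);
    p[3] = (uint8_t)w;
}

/* Read the 32-bit word held in slot 0..3. */
uint32_t vec128_get_u32(const vec128 *r, int slot)
{
    const uint8_t *p = &r->b[4 * slot];
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Replicate one word in all four slots of the register. */
void vec128_splat_u32(vec128 *r, uint32_t w)
{
    for (int slot = 0; slot < 4; slot++)
        vec128_set_u32(r, slot, w);
}
```

    Filling the register with 0x00018ff0 and then setting slot 2 to
    0xdeadbeef changes only bytes 8 to 11, which is the third word from the
    left in the 128-bit value printed by gdb.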

    To let you better understand when the SPU code starts executing, and to
    follow it, gdb also includes an interesting option:


    ---------- begin output ----------
    (gdb) set spu stop-on-load
    (gdb) run
    ...
    (gdb) info registers
    ---------- end output ----------

    Another important piece of information for debugging your code is the
    internal sizes, so you can be prepared for overlapping. Useful
    information can be obtained with the following code fragment inside your
    SPU program (careful: it does not free the allocated memory).

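    --- code ---
    #include <stdio.h>
    #include <stdlib.h>   /* malloc() */
    #include <unistd.h>   /* sbrk() */

    /* Symbols placed by the linker at the end of each section */
    extern int _etext;
    extern int _edata;
    extern int _end;

    void meminfo(void)
    {
        printf("\n&_etext: %p", (void *)&_etext);
        printf("\n&_edata: %p", (void *)&_edata);
        printf("\n&_end: %p", (void *)&_end);
        printf("\nsbrk(0): %p", sbrk(0));
        /* The block is intentionally leaked, only its address matters */
        printf("\nmalloc(1024): %p", malloc(1024));
        printf("\nsbrk(0): %p", sbrk(0));
    }
    --- end code ---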

    And of course you can also play with the GCC and LD arguments to have more
    debugging info:

    --- code ---
    # vi Makefile
    CFLAGS += -g
    LDFLAGS += -Wl,-Map,map_filename.map
    --- end code ---



    ---[ 2.4 - Software Development for Linux on Cell

    In this chapter I will introduce the internals of Cell development,
    giving the basic knowledge necessary to better understand the following
    chapters.

    ---[ 2.4.1 - PPE/SPE hello world

    Every program in Cell that uses the SPEs needs to have at least two source
    files. One for the PPE and another one for the SPE.

    Following is a simple code to run on the SPE (it's also in the attached
    tar file):

    --- code ---
    #include <stdio.h>

    int main(unsigned long long speid, unsigned long long argp, unsigned long long envp)
    {
    printf("\nHello World!\n");
    return 0;
    }
    --- end code ---

    The Makefile for this code will look like:


    --- code ---
    PROGRAM_spu = hello_spu
    LIBRARY_embed = hello_spu.a
    IMPORTS = $(SDKLIB_spu)/libc.a
    include $(TOP)/make.footer
    --- end code ---

    Of course it looks like any normal code. The PPE, as already explained, is
    responsible for creating the new thread and allocating it on the SPE:

    --- code ---
    #include <stdio.h>
    #include <libspe.h>

    extern spe_program_handle_t hello_spu;

    int main(void)
    {
    speid_t speid;
    int status;

    speid=spe_create_thread(0, &hello_spu, NULL, NULL, -1, 0);
    spe_wait(speid, &status, 1);
    return 0;
    }
    --- end code ---

    With the following Makefile:

    --- code ---
    DIRS = spu
    PROGRAM_ppu = hello_ppu
    IMPORTS = ../spu/hello_spu.a -lspe
    include $(TOP)/make.footer
    --- end code ---

    The reader will notice that the speid in the PPE program will be the same
    value as the speid in the main of the SPE.

    Also, the arguments passed to spe_create_thread() are the ones
    received by the SPE program when running (argp and envp are equal to NULL
    in our sample).

    It's important to remember that, when compiled, this program will generate
    a binary in the spu directory called hello_spu, and another one in the
    root directory of this example called hello_ppu, which CONTAINS the
    embedded hello_spu.


    ---[ 2.4.2 - Standard Library Calls from SPE

    When the SPE program needs to use any standard library call, like, for
    example, printf or exit, it has to call back to the PPE main thread.

    It uses a simple stop-and-signal assembly instruction with standardized
    argument values (important to remember, since it's needed in
    shellcodes for SPE).

    That value is returned from the ioctl call and the user thread must react
    to it. This means copying the arguments from the SPE Local Storage,
    executing the library call and then calling ioctl again.

    The instruction according to the manual:
    "stop u14 - Stop and signal. Execution is stopped, the current
    address is written to the SPU NPC register, the value u14 is
    written to the SPU status register, and an interrupt is sent to
    the PPU."

    This is a disassembly output of the hello_spu program:

    ---------- begin output ----------
    # spu-gdb ./hello_spu
    GNU gdb 6.3
    Copyright 2004 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are
    welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "--host=powerpc64-unknown-linux-gnu --target=spu"...
    (gdb) disassemble main
    Dump of assembler code for function main:
    0x00000170 <main+0>: ila $3,0x340 <.rodata>
    0x00000174 <main+4>: stqd $0,16($1)
    0x00000178 <main+8>: nop $127
    0x0000017c <main+12>: stqd $1,-32($1)
    0x00000180 <main+16>: ai $1,$1,-32
    0x00000184 <main+20>: brsl $0,0x1a0 <puts> # 1a0
    0x00000188 <main+24>: ai $1,$1,32 # 20
    0x0000018c <main+28>: fsmbi $3,0
    0x00000190 <main+32>: lqd $0,16($1)
    0x00000194 <main+36>: bi $0
    0x00000198 <main+40>: stop
    0x0000019c <main+44>: stop
    End of assembler dump.
    (gdb)
    ---------- end output ----------
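
    To relate the manual's "stop u14" to raw instruction words, here is a
    minimal sketch. It assumes the stop encoding I remember from the SPU ISA
    document [12] (an all-zero 11-bit opcode, 7 reserved bits, and the 14-bit
    signal type in the low bits), so check it against the manual before
    relying on it. The function names are hypothetical. Under that
    assumption an all-zero word decodes as "stop", which would explain the
    two trailing stop lines in the dump above.

```c
#include <stdint.h>

/* Assumed layout: [11-bit opcode = 0][7 reserved bits][14-bit signal type] */

/* Build the instruction word for "stop u14". */
uint32_t spu_encode_stop(uint32_t u14)
{
    return u14 & 0x3fffu;
}

/* Returns 1 when the top 11 bits (the opcode) are all zero. */
int spu_is_stop(uint32_t insn)
{
    return (insn >> 21) == 0;
}

/* Extract the 14-bit signal type that ends up in the SPU status register. */
uint32_t spu_stop_type(uint32_t insn)
{
    return insn & 0x3fffu;
}
```

    Note that, with this layout, every stop instruction carries at least one
    zero byte in its upper half, something to keep in mind for the null byte
    discussion later in the article.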


    ---[ 2.4.3 - Communication Mechanisms

    The architecture offers three main communication mechanisms:
    - DMA
    Used to move data and instructions between main storage and
    a local storage. SPEs rely on asynchronous DMA transfers to hide
    memory latency and transfer overhead by moving information in
    parallel with SPU computation.

    - Mailbox
    Used for control communications between a SPE and the
    PPE or other devices. Mailboxes hold 32-bit messages. Each
    SPE has two mailboxes for sending messages and one mailbox for
    receiving messages.

    - Signal Notification
    Used for control communications from PPE or
    other devices. Signal notification (also known as signalling)
    uses 32-bit registers that can be configured for
    one-sender-to-one-receiver signalling or
    many-senders-to-one-receiver signalling.

    All three are controlled and implemented by the SPE MFC, and their
    importance is related to the way the vulnerable program will receive its
    input.
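
    The mailbox behaviour can be pictured as a small queue of 32-bit
    messages. The sketch below is a host-side model only: the mbox type, the
    mbox_* functions and the MBOX_MAX bound are hypothetical, and the queue
    depth is a parameter because the text above does not state how many
    entries each hardware mailbox holds.

```c
#include <stdint.h>

#define MBOX_MAX 8u   /* upper bound for this model only, not a hardware value */

/* Hypothetical model of one mailbox: a FIFO of 32-bit messages. */
typedef struct {
    uint32_t slot[MBOX_MAX];
    unsigned depth;   /* number of entries the mailbox can hold */
    unsigned head;    /* index of the oldest message */
    unsigned count;   /* messages currently queued */
} mbox;

/* Prepare an empty mailbox with the given depth (clamped to MBOX_MAX). */
void mbox_init(mbox *m, unsigned depth)
{
    m->depth = depth > MBOX_MAX ? MBOX_MAX : depth;
    m->head = 0;
    m->count = 0;
}

/* Queue one message; returns 0 when full (a real writer would stall). */
int mbox_write(mbox *m, uint32_t msg)
{
    if (m->count == m->depth)
        return 0;
    m->slot[(m->head + m->count) % m->depth] = msg;
    m->count++;
    return 1;
}

/* Dequeue the oldest message; returns 0 when the mailbox is empty. */
int mbox_read(mbox *m, uint32_t *msg)
{
    if (m->count == 0)
        return 0;
    *msg = m->slot[m->head];
    m->head = (m->head + 1) % m->depth;
    m->count--;
    return 1;
}

/* Number of queued messages, the kind of value a status read reports. */
unsigned mbox_stat(const mbox *m)
{
    return m->count;
}
```

    The point for an attacker is simply that whatever arrives through this
    queue is a 32-bit value chosen by the sender, often a pointer or a size
    that the receiver will later feed to a DMA command.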

    ---[ 2.4.4 - Memory Flow Control (MFC) Commands

    This is the main mechanism for the SPE to access the main storage and
    maintain synchronization with other processors and devices in the system.

    MFC commands can be issued either by the SPE itself, or by the processor
    and other devices, as follows:
    - Code running on the SPU issues an MFC command by executing a
    series of writes and/or reads using channel instructions.
    - Code running on the PPU or any other device issues an MFC
    command by performing a series of stores and/or loads to
    memory-mapped I/O (MMIO) registers in the MFC.

    The MFC commands are then queued in one of these independent queues:
    - MFC SPU Command Queue - For channel-initiated commands by the
    associated SPU
    - MFC Proxy Command Queue - For MMIO-initiated commands by the PPE
    or other devices.


    ---[ 2.4.5 - Direct Memory Access (DMA) Commands

    The MFC commands that transfer data are referred to as DMA commands. The
    transfer direction for DMA commands is based on the SPE point of view:
    - Into a SPE (from main storage to the local storage) -> get
    - Out of a SPE (from local storage to the main storage) -> put

    ---[ 2.4.5.1 - Get/Put Commands

    DMA get from the main memory to the local storage:
    (void) mfc_get (volatile void *ls, uint64_t ea, uint32_t size,
    uint32_t tag, uint32_t tid, uint32_t rid)

    DMA put into the main memory from the local storage:
    (void) mfc_put (volatile void *ls, uint64_t ea, uint32_t size,
    uint32_t tag, uint32_t tid, uint32_t rid)

    To guarantee the ordering of the writes to the main memory, there are
    these options:
    - mfc_putf: the 'f' means fenced, that is, all commands issued
    before it within the same tag group must finish first, while later
    ones may still be executed before it

    - mfc_putb: the 'b' here means barrier, or, that the barrier
    command and all commands issued thereafter are NOT executed
    until all previously issued commands in the same tag group have
    been performed


    ---[ 2.4.5.2 - Resources

    For DMA operations the system uses DMA transfers with variable length
    sizes (1, 2, 4, 8 and n*16 bytes, n being an integer, of course). There is
    a maximum of 16 KB per DMA transfer, and 128-byte alignment offers better
    performance.
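
    The size rules above are easy to get wrong when a length comes from
    untrusted input, so here is a minimal sketch of a validity check. The
    names dma_size_ok, dma_chunks and DMA_MAX_BYTES are hypothetical helpers
    written for this example, not SDK functions, and the check covers only
    the sizes listed above (alignment is not modelled).

```c
#include <stdint.h>

#define DMA_MAX_BYTES 16384u   /* 16 KB limit per transfer, as stated above */

/* Returns 1 for 1, 2, 4, 8 or a non-zero multiple of 16 up to 16 KB. */
int dma_size_ok(uint32_t size)
{
    if (size == 1 || size == 2 || size == 4 || size == 8)
        return 1;
    return size != 0 && size % 16 == 0 && size <= DMA_MAX_BYTES;
}

/*
 * Number of transfers needed to move 'total' bytes in pieces of at most
 * 16 KB. 'total' must be a non-zero multiple of 16, otherwise 0 is returned.
 */
uint32_t dma_chunks(uint32_t total)
{
    if (total == 0 || total % 16 != 0)
        return 0;
    return total / DMA_MAX_BYTES + (total % DMA_MAX_BYTES != 0 ? 1u : 0u);
}
```

    For example, a 16400-byte buffer is a multiple of 16 but exceeds the
    per-transfer limit, so it has to be split in two transfers.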

    The DMA queues are defined per SPU, with a 16-element queue for
    SPU-initiated requests and an 8-element queue for PPU-initiated requests.
    SPU-initiated requests always have higher priority.

    To differentiate DMA commands, each one receives a tag, a 5-bit
    identifier. The same identifier can be applied to multiple commands, since
    it's used for polling status or waiting on the completion of the DMA
    commands.
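
    Since a tag is a 5-bit identifier, a set of tag groups fits in one 32-bit
    mask with one bit per tag. The sketch below shows that arithmetic on the
    host; tag_bit, tag_mask, all_tags_done and any_tag_done are hypothetical
    helper names, and 'completed' is just a bitmap standing in for the status
    the hardware would report.

```c
#include <stdint.h>

/* Bit that represents one tag group (0..31); 0 for an invalid tag. */
uint32_t tag_bit(unsigned tag)
{
    return tag < 32 ? (uint32_t)1 << tag : 0;
}

/* Mask that covers every tag in the array. */
uint32_t tag_mask(const unsigned *tags, unsigned n)
{
    uint32_t mask = 0;
    for (unsigned i = 0; i < n; i++)
        mask |= tag_bit(tags[i]);
    return mask;
}

/* "Wait for all": every selected tag group has completed. */
int all_tags_done(uint32_t completed, uint32_t mask)
{
    return (completed & mask) == mask;
}

/* "Wait for any": at least one selected tag group has completed. */
int any_tag_done(uint32_t completed, uint32_t mask)
{
    return (completed & mask) != 0;
}
```

    Using an unsigned 1 for the shift matters for tag 31, because shifting a
    signed 1 into the sign bit is undefined behaviour in C.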

    A great feature provided is the DMA list, where a single DMA command can
    cause the execution of a list of transfer requests (stored in local
    storage). Lists implement scatter-gather functions and may contain up to
    2K transfer requests.

    ---[ 2.4.5.3 - SPE 2 SPE Communication

    An address in another SPE local storage is represented as a 32-bit
    effective address (global address).

    An SPE issuing a DMA command needs a pointer to the other SPE's local
    storage. The PPE code can obtain the effective address of an SPE's local
    storage:

    --- code ---
    #include <libspe.h>

    speid_t speid;
    void *spe_ls_addr;
    spe_ls_addr=spe_get_ls(speid);
    --- end code ---

    This permits the PPE to give the SPEs each other's local storage
    addresses and control the communications. Vulnerabilities may arise no
    matter what the communication flow is, even without involving the PPE
    itself.

    Following is a simple DMA demo program between PPE and SPE (see the
    attached file for the complete version). This program will send an
    address in the PPE to the SPE through DMA:

    --- PPE code ---
    information_sent is[1] __attribute__ ((aligned (128)));
    spe_gid_t gid;
    speid_t speid;
    int status;
    int *pointer = (int *)malloc(128);

    gid = spe_create_group(SCHED_OTHER, 0, 1);

    if (spe_group_max(gid) < 1) {
        printf("\nOops, there is no free SPE to run it...\n");
        exit(EXIT_FAILURE);
    }

    is[0].addr = (unsigned int) pointer;

    /* Create the SPE thread */
    speid = spe_create_thread(gid, &hello_dma, (unsigned long long *) &is[0], NULL, -1, 0);

    /* Wait for the SPE to complete */
    spe_wait(speid, &status, 0);

    /* Best practice: issue a sync before ending - This is good for us ;) */
    __asm__ __volatile__ ("sync" : : : "memory");
    --- end code ---


    --- SPE code ---
    information_sent is __attribute__ ((aligned (128)));

    int main(unsigned long long speid, unsigned long long argp, unsigned long long envp)
    {
        /* Where:
           is         -> Address in local storage to place the data
           argp       -> Main memory address
           sizeof(is) -> Number of bytes to read
           31         -> Associated tag to this DMA (from 0 to 31)
           0          -> Not useful here (just when using caching)
           0          -> Not useful here (just when using caching)
        */
        mfc_get(&is, argp, sizeof(is), 31, 0, 0);

        /* The tag mask is always 1 left-shifted by the value of your tag */
        mfc_write_tag_mask(1U << 31);

        /* Wait until the tagged DMA completes */
        mfc_read_tag_status_all();

        return 0;
    }
    --- end code ---

    And now between two SPEs (also for the complete code, please refer to the
    attached sources):

    --- PPE code ---
    speid_t speid[2];
    speid[0]=spe_create_thread (0, &dma_spe1, NULL, NULL, -1, 0);
    speid[1]=spe_create_thread (0, &dma_spe2, NULL, NULL, -1, 0);

    for (i=0; i<2; i++) local_store[i]=spe_get_ls(speid[i]); /* Get local storage address */

    for (i=0; i<2; i++) spe_kill(speid[i], SIGKILL); /* Send SIGKILL to the SPE threads */
    --- end code ---

    --- SPE code ---
    /* Write something to the PPE */
    spu_write_out_mbox(buffer);

    /* Read something from the PPE */
    pointer = spu_read_in_mbox();

    /* DMA interface */
    mfc_get(buffer, pointer, size, tag, 0, 0);
    wait_on_mask(1<<tag);

    /* DMA something to the second SPE */
    mfc_put(buffer, local_store[1], size, tag, 0, 0);
    wait_on_mask(1<<tag);

    /* Notify the PPE */
    spu_write_out_mbox(1);
    --- end code ---

    ------[ 3 - Exploiting Software Vulnerabilities on Cell SPE

    I love the architecture manuals and the engineers and the way they talk
    about really dumb design choices:

    "The SPU Local Store has no memory protection, and memory access wraps
    from the end of the Local Store back to the beginning. An SPU program is
    free to write anywhere in the Local Store including its own instruction
    space. A common problem in SPU programming is the corruption of the SPU
    program text when the stack area overflows into the program area. This
    problem typically does not become apparent until some later point in the
    program execution when the program attempts to execute code in area that
    was corrupted, which typically results in illegal instruction exception.
    Even with a debugger it can be difficult to track down this type of
    problem because the cause and effect can occur far apart in the program
    execution. Adding printf's just moves failure point around".

    ---[ 3.1 - Memory Overflows

    From the aforementioned memory design of the SPU it is already clear that
    when an attacker controls the overwrite size it's really easy to exploit
    an SPU vulnerability, just by replacing the original program .text with
    the attacker's one.

    It's important to note that the SPU interrupt facility can be configured
    to branch to an interrupt handler at address 0 if an external condition is
    true (bisled, branch indirect and set link if external data, is the
    instruction used to check if there is external data available). Since the
    memory layout loops around, it's always possible to overwrite this handler
    if it's being used.
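
    The wrap-around can be simulated on any machine. The sketch below
    assumes a 256 KB local store (so addresses are masked with 0x3FFFF);
    ls_write, LS_SIZE and LS_MASK are hypothetical names for this model and
    nothing here touches real SPE memory.

```c
#include <stddef.h>
#include <stdint.h>

#define LS_SIZE 0x40000u          /* assumed 256 KB local store */
#define LS_MASK (LS_SIZE - 1u)    /* every address is wrapped with this mask */

/*
 * Copy 'len' bytes into a simulated local store starting at 'addr',
 * wrapping past the end like the SPU does. Returns how many bytes landed
 * after the wrap, that is, at the bottom of the local store.
 */
size_t ls_write(uint8_t *ls, uint32_t addr, const uint8_t *src, size_t len)
{
    uint64_t start = addr & LS_MASK;
    size_t wrapped = 0;

    for (size_t i = 0; i < len; i++) {
        uint64_t linear = start + i;      /* address without wrapping */
        if (linear >= LS_SIZE)
            wrapped++;
        ls[linear & LS_MASK] = src[i];    /* address actually written */
    }
    return wrapped;
}
```

    A 32-byte overflow that starts 16 bytes below the top of the local store
    puts its last 16 bytes at addresses 0x00000 to 0x0000F, exactly where an
    interrupt handler at address 0 (or the start of .text) would live.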

    Another important note is the fact that instructions on memory MUST be
    aligned on word boundaries.

    There are instruction and data caches for the local storage (depending on
    the implementation details), so it's important to assure:
    - You are overflowing a large enough amount of data to avoid
    caching
    - You are not using a self-modifying shellcode unless you issue
    the sync instruction (see [13] for references)


    ---[ 3.1.1 - SPE memory layout

    The memory layout for the SPE looks like:

    ------------------------  -> 0x3FFFF
     SPU ABI Reserved Usage
    ------------------------   |  Stack grows from the
     Runtime Stack             |  higher addresses to
    ------------------------   |  the lower addresses.
     Global Data               |
    ------------------------   \/
     .Text
    ------------------------  -> 0x00000

    To test your application, it's really interesting to use the 'size'
    application:

    ---------- begin output ----------
    # size hello_spu
       text    data     bss     dec     hex filename
       1346     928      32    2306     902 hello_spu
    ---------- end output ----------
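
    The numbers printed by 'size' give a quick estimate of how much room is
    left above the program image. This is a rough sketch only: it assumes a
    256 KB local store and ignores alignment padding between sections, and
    image_bytes and ls_room are hypothetical helper names.

```c
#include <stdint.h>

#define LS_SIZE 0x40000u   /* assumed 256 KB local store */

/* Bytes occupied by the loaded image: text + data + bss (the 'dec' column). */
uint32_t image_bytes(uint32_t text, uint32_t data, uint32_t bss)
{
    return text + data + bss;
}

/*
 * Bytes left for heap, stack and the ABI reserved area, or 0 when the
 * image does not fit in the local store at all.
 */
uint32_t ls_room(uint32_t text, uint32_t data, uint32_t bss)
{
    uint32_t used = image_bytes(text, data, bss);
    return used <= LS_SIZE ? LS_SIZE - used : 0;
}
```

    For hello_spu this gives 1346 + 928 + 32 = 2306 bytes (0x902) of image,
    so an overflow coming down from the stack has a long way to go before it
    reaches .text, unless it wraps around from the top instead.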



    ---[ 3.1.2 - SPE assembly basics

    In order to develop a shellcode, it's important to understand the
    differences in the SPE assembly when compared to PowerPC.

    The SPE uses RISC-based assembly, which means there is a small set of
    instructions, and everything in the SPE runs in user-mode (there is no
    kernel-mode for the SPE). That said, we need to remember there are no
    system calls; instead there are the PPE calls (stop instructions).

    It is also a big endian architecture (keep that in mind while reading the
    following sections).

    This architecture provides many ways to avoid branches in the code for
    maximum efficiency. Since that's not a real concern while exploiting
    software, I won't cover it, nor will I cover the SIMD instructions. For
    more information, refer to the SPU Instruction Set Architecture document
    [12].


    ---[ 3.1.2.1 - Registers

    I already explained a little about the way the architecture works, and in
    this section I'll just describe the available register set and how to use
    it.

    The SPE does not define a condition register, so the comparison
    operations produce results that are either all 0 bits (false) or all 1
    bits (true), with the same width as the operands being tested. These
    results are used for bitwise masking, instruction selection or
    conditional branching.

    As on any other platform, there are general purpose registers and special
    purpose registers in the SPE:
    - General Purpose Registers (0-127) Used in different ways by the
    instructions. In the second word of R1 you have the information
    about the amount of free space in the stack (the room between
    end of the heap and the start of the stack).

    - Special Purpose Registers
    The SPE also supports 128 special-purpose registers. Some
    interesting ones:
    * SRR0 - Save and Restore Register 0 - Holds the address
    used by the interrupt return (iret) instruction
    * LR - Link Register - All branch instructions that set
    the link register will force the address of the next
    instruction to be loaded on this register
    * CTR - Count Register - Usually it's used to hold a loop
    counter (like the loop instruction and %ecx register in
    intel x86 architecture)
    * CR - Condition Register - Used to perform conditional
    comparisons


    To move data between Special Purpose Registers and General Purpose
    Registers we have the instructions:
    * mtspr (move to special purpose register)
    * mfspr (move from special purpose register)


    ---[ 3.1.2.2 - Local Storage Addressing Mode

    In order to address information to/from Local Storage, the instructions
    use the following structure:

    Instruction_Opcode   l10_field   RA_field   RT_field
    8-bit                10-bit      7-bit      7-bit

    Where: the signed value of the l10 field is appended with 4 zeros and then
    added to the preferred slot of RA, forcing the 4 rightmost bits of the
    sum to zero. Then (for a load) the 16 bytes at that local storage address
    are placed in the register RT.

    The preferred slot, from the architecture point of view, is the leftmost
    4 bytes (not bits).

    It's important to note here that the IBM convention specifies that:
    l10 means a 10-bit immediate value
    RA means a general purpose register to be used as
    source/destination
    RT means a general purpose register to be used as destination
    (target)

    Knowing that makes it easier to understand why the Local Storage Address
    Space is limited to 4 GB.

    The actual size of the Local Storage can be viewed by accessing the LSLR
    (local storage limit register). All effective addresses are ANDed with the
    value in the LSLR before being used.
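
    The address computation just described can be written down directly. This
    is a sketch of the arithmetic only, with hypothetical names (sext10,
    dform_address); the LSLR value is passed in by the caller, and 0x3FFFF is
    used in the examples on the assumption of a 256 KB local store.

```c
#include <stdint.h>

/* Sign-extend the low 10 bits of 'v' (the l10 field). */
int32_t sext10(uint32_t v)
{
    v &= 0x3ffu;
    return (v & 0x200u) ? (int32_t)v - 0x400 : (int32_t)v;
}

/*
 * Local storage address used by a quadword load/store:
 * (l10 with 4 zero bits appended + preferred slot of RA), wrapped by the
 * LSLR, with the 4 rightmost bits forced to zero.
 */
uint32_t dform_address(uint32_t l10, uint32_t ra_preferred, uint32_t lslr)
{
    uint32_t ea = ra_preferred + (uint32_t)(sext10(l10) * 16);
    return ea & lslr & ~(uint32_t)0xf;
}
```

    In "lqd $0, 16($1)" the assembler takes the byte offset 16, so the l10
    field holds 1. With $1 = 0x3FFD0 the load comes from 0x3FFE0, and an
    offset that pushes the sum past the LSLR simply wraps to a low address.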

    ---[ 3.1.2.3 - External Devices

    The SPU can send/receive data to/from external devices using the channel
    interface. The channel instructions use quadwords (128 bits) to transfer
    data between general purpose registers and the channel device (which
    supports 128 channels).


    ---[ 3.1.2.4 - Instruction Set

    Here are some useful instructions to be used while developing a shellcode
    for the SPE.

    Instruction                          Operands
        Description
        Sample
    -------------------------------------------------------------------------
    lqd (load quadword)                  rt,symbol(ra)
        Loads a value (16 bytes) from the Local Storage address pointed to
        by RA into the general purpose register RT.
        lqd $0, 16($1)

    stqd (store quadword)                rt,symbol(ra)
        The contents of the register RT are stored at the Local Storage
        address pointed to by RA.
        stqd $0, 16($1)

    ilh (immediate load halfword)        rt,symbol
        The value of l16 is placed in register RT.
        ilh $0, 0x1a0

    il (immediate load word)             rt,symbol
        The value of l16 is expanded to 32 bits by replicating the leftmost
        bit and then written to RT.
        il $0, 0x1a0

    nop (no operation)                   rt
        This instruction uses a false RT and nothing is changed.
        nop $127

    ila (immediate load address)         rt,symbol
        The value of l18 is placed in the rightmost 18 bits of RT (the
        remaining bits of RT are zeroed).
        ila $3, 0x340

    a (add word)                         rt,ra,rb
        The operand in register RA is added to the operand in register RB
        and the result is written to RT.
        a $0, $1, $2

    ai (add word immediate)              rt,ra,value
        The value (l10 field) is added to the operand in RA and the result
        is written to RT.
        ai $1, $1, -32

    brsl (branch relative and set link)  rt,symbol
        Execution proceeds to the target instruction and a link register is
        set. The symbol is an l16 value, extended on the right with two 0
        bits; the address of the current instruction is added to it to form
        the branch target. The address of the next instruction is written
        to the preferred slot of the RT register.
        brsl $0, 0x1a0

    fsmbi (form select mask for bytes immediate)  rt,symbol
        The symbol is an l16 value used to create a mask in the register RT
        by copying each bit eight times. Bits in the operand are related to
        bytes in the result in a left-to-right correspondence.
        fsmbi $3, 0

    bi (branch indirect)                 ra
        Execution proceeds to the address in the preferred slot of RA. The
        two rightmost bits of RA are ignored (supposed to be zero). There
        are two flags, D and E, to disable and enable interrupts.
        bi $0
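
    A few of the descriptions above are easier to check as plain C. The
    functions below are a sketch of the described semantics only, with
    hypothetical names; they model the value written to the preferred slot
    (or, for fsmbi, the 16 result bytes) and are not an emulator.

```c
#include <stdint.h>

/* il: the 16-bit immediate is sign-extended to 32 bits. */
uint32_t il_word(uint16_t l16)
{
    return (uint32_t)(int32_t)(int16_t)l16;
}

/* ila: the 18-bit immediate is zero-extended. */
uint32_t ila_word(uint32_t l18)
{
    return l18 & 0x3ffffu;
}

/* fsmbi: each immediate bit, left to right, becomes a 0x00 or 0xff byte. */
void fsmbi_bytes(uint16_t l16, uint8_t out[16])
{
    for (int i = 0; i < 16; i++)
        out[i] = (l16 & (0x8000u >> i)) ? 0xff : 0x00;
}

/* brsl: target = current address + (sign-extended l16 with two 0 bits). */
uint32_t brsl_target(uint32_t pc, uint16_t l16, uint32_t lslr)
{
    int32_t offset = (int32_t)(int16_t)l16 * 4;
    return (pc + (uint32_t)offset) & lslr & ~(uint32_t)3;
}

/* brsl: the link value is the address of the next instruction. */
uint32_t brsl_link(uint32_t pc, uint32_t lslr)
{
    return (pc + 4u) & lslr;
}
```

    Checking against the hello_spu dump: the brsl at 0x184 reaches puts at
    0x1a0 with an l16 of 7 (7 * 4 = 28 = 0x1c), and the link value is 0x188,
    the address of the instruction right after the branch.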



    ---[ 3.1.3 - Exploiting Software Vulnerabilities in SPE

    First of all, it's important to make it even more clear that it is
    impossible to, for example, force the SPE process to execute a new command
    (a.k.a. execve() shellcodes). The same happens for network-based library
    functions and others: as already explained, we need the PPE to proxy that
    for us.

    So it opens two new paths:
    - Create a PPE shellcode to be used while exploiting PPE software
    vulnerabilities that will spawn a proxy for commands received by
    the SPE and will create a SPE thread to do the whole job -> This
    is pure PPC shellcode and this article already discussed
    everything needed to achieve that. In the attached sources you
    have samples in the directory cell-ppe/ [16].
    - Create a vulnerability specific code for the SPE, that will
    print out internal program information related to the exploited
    SPE. This is especially interesting and difficult because:
    * Need to remember that the SPE uses instruction-cache, so
    sometimes if you overflow just a small amount of bytes,
    it will be especially difficult to get it executed
    * If you use the wrap-around characteristics of the memory
    layout for the SPE, you will probably overwrite also the
    information you are interested in.

    On the other hand, it's important to say that all the information will
    always be in the same place (or, easier to understand: there is no ASLR in
    the SPE). Running the attached samples (especially the SPE-SPE
    communication one, because it prints the pointer addresses) will make it
    clear to the reader.


    ---[ 3.1.3.1 - Avoiding Null Bytes

    It is important to avoid null bytes, so we cannot use the NOP instruction
    in our shellcode.

    This creates a problem, since the ori instruction will also generate a
    null byte if used with 0 as an argument (e.g: ori $1, $1, 0).

    A good replacement is the or instruction (e.g: or $1, $1, $1) or the
    usage of multiple instructions (which will reduce the probability of
    hitting your return address).
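
    Before using a payload it helps to scan the assembled words for zero
    bytes. The sketch below does only that; word_has_null and first_null_word
    are hypothetical helper names, and the words should come from your own
    assembler output (no real encodings are assumed here).

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if any of the four bytes of a 32-bit instruction word is zero. */
int word_has_null(uint32_t insn)
{
    for (int shift = 0; shift < 32; shift += 8)
        if (((insn >> shift) & 0xffu) == 0)
            return 1;
    return 0;
}

/* Index of the first instruction word containing a null byte, or -1. */
long first_null_word(const uint32_t *code, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (word_has_null(code[i]))
            return (long)i;
    return -1;
}
```

    Because instructions are fixed-size 32-bit words, a single small
    immediate or a low register number is enough to introduce a zero byte,
    so the check is worth running on every candidate replacement for nop.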


    ---[ 3.1.4 - Finding software vulnerabilities on SPE

    The simulator provided by IBM has a feature that monitors selected
    addresses or regions of the Local Store for read and write accesses. This
    feature can help identify stack overflow conditions.

    It is invoked from the simulator command window as follows:
    enable_stack_checking [spu_number] [spu_executable_filename]

    This procedure uses the nm system utility to determine the area of the
    Local Storage that will contain the program code and creates trigger
    functions to trap writes by the SPU into this region.

    It's important to notice that this approach only looks for writes to the
    text and static data, and not to the heap. Of course, the same approach
    used by this feature could be used to help create a fuzzer using
    TCL scripts based on the one provided.

    ------[ 4 - Future and other uses

    I can't foresee the future, but this kind of architecture is becoming
    more and more common and will open a wide range of new vulnerabilities.

    The complexity behind this kind of asymmetric multi-threaded architecture
    is even higher than in the normal ones. The lack of memory protection
    will also help attackers subvert those systems. The main processor being
    based on an already well-known architecture (PowerPC) also helps the
    dissemination of malicious code.

    Many other researchers are doing stuff using Cell:
    - Nick Breese presented the Crackstation project at BlackHat.
    Basically he used the SIMD capabilities and big registers
    provided by the architecture to crack passwords [5]

    - IBM Researchers released a study about the usage of the Cell SPU
    as a Garbage Collector Co-processor [14]

    - Maybe there are JTAG-based interfaces on the Cell machines to try
    to use RiscWatch [15]

    - Unfortunately the SPU accesses are controlled by the PPE, so running
    integrity protection mechanisms from the SPU seems infeasible ->
    Anyway, I wrote a network traffic analyzer using Cell as the base
    architecture.



    ------[ 5 - Acknowledgments

    A lot of people helped me in the long way for these researches that
    resulted in something funny to be published, you all know who you are.

    Special thanks to the Phrack Staff for the great review of the article,
    giving a lot of important insights about how to better structure it and
    giving a real value to it.

    I always need to thanks to Filipe Balestra, my research partner, for
    sharing with me his ideas, feedbacks, comments and experiences improving a
    lot the article and the samples.

    I'll never forget to thank my research team and friends at
    RISE Security (http://www.risesecurity.org) for always keeping me
    motivated to study completely new things. Be sure that the unix-asm [16]
    project will be updated soon with all the stuff shown here and many more
    types of shellcode for the architecture. Also, of course, the
    updates will be available for Metasploit.

    Big thanks to the Cell kernel guru, Andre Detsch, for sharing his ideas
    with me and discussing the internals of the Linux implementation for Cell.

    Thanks to the conference organizers who invited me to talk about Cell
    Software Exploitation; even after many people had already talked about
    Cell, they trusted that my talk was not about brute-forcing (yeah, a lot
    of fun in completely different cultures).

    To my girlfriend, who waited for me (alone, I suppose) during these travels.

    It's impossible not to thank COSEINC for letting me keep doing this
    research on important company time.

    ------[ 6 - References

    [1] Cell Broadband Engine Architecture, v1.01 October 2006
    http://cell.scei.co.jp/pdf/CBE_Architecture_v101.pdf

    [2] Sony Computer Entertainment
    http://www.sony.com

    [3] Toshiba Corporation
    http://www.toshiba.com

    [4] IBM Corporation
    http://www.ibm.com

    [5] Breese, Nick; "Crackstation"; Black Hat Europe 2008
    http://www.blackhat.com/presentations/bh-europe08/Bresse/Presentation/bh-eu-08-breese.pdf

    [6] IBM Power Architecture
    http://www-03.ibm.com/chips/power/

    [7] IBM Bladecenter QS21
    http://www.ibm.com/systems/bladecenter/hardware/servers/qs21/index.html

    [8] IBM Roadrunner Supercomputer
    http://en.wikipedia.org/wiki/IBM_Roadrunner

    [9] The cell project at IBM Research
    http://www.research.ibm.com/cell/

    [10] Cell Simulator
    http://www.alphaworks.ibm.com/tech/cellsystemsim

    [11] Cell resource center at developerWorks (SDK download)
    http://www-128.ibm.com/developerworks/power/cell/

    [12] Synergistic Processor Unit Instruction Set Architecture v1.2
    http://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/76CA6C7304210F3987257060006F2C44/$file/SPU_ISA_v1.2_27Jan2007_pub.pdf

    [13] Moore, H.D; "Mac OS X PPC Shellcode Tricks"; Uninformed Magazine 2005
    http://www.uninformed.org/?v=1&a=1&t=txt

    [14] Cher, Chen-Yong; Gschwind, Michael; "Cell GC: Using the Cell Synergistic Processor as a Garbage Collector Coprocessor"; 2008
    http://www.research.ibm.com/cell/papers/2008_vee_cellgc_slides.pdf

    [15] RISCWatch Debugger
    http://www.ibm.com/chips/techlib/techlib.nsf/products/RISCWatch_Debugger

    [16] Carvalho, Ramon de; "Cell PPE Shellcodes"; RISE Security;
    http://www.risesecurity.org/papers/lopbuffer.pdf


    Others:

    PowerPC User Instruction Set Architecture, Book I, v2.02 January 2005
    http://moss.csc.ncsu.edu/~mueller/cluster/ps3/SDK3.0/docs/arch/PPC_Vers202_Book1_public.pdf

    PowerPC Virtual Environment Architecture, Book II, v2.02 January 2005
    http://moss.csc.ncsu.edu/~mueller/cluster/ps3/SDK3.0/docs/arch/PPC_Vers202_Book2_public.pdf

    PowerPC Operating Environment Architecture, Book III, v2.02 January 2005
    http://moss.csc.ncsu.edu/~mueller/cluster/ps3/SDK3.0/docs/arch/PPC_Vers202_Book3_public.pdf

    Cell developer's corner at power.org
    http://www.power.org/resources/devcorner/cellcorner/

    Linux info at the Barcelona Supercomputing Center website
    http://www.bsc.es/projects/deepcomputing/linuxoncell


    ------[ 7 - Notes on SDK/Simulator Environment

    There are some pictures of the simulator and SDK running in the attached files:
    images/cell-sim1.jpg and images/cell-sim2.jpg

    To install the SDK/Simulator, do:
    - Download the Cell SDK ISO image from the IBM alphaWorks website.
    - Mount the disk image on the mount directory:
    mount -o loop CellSDK<version>.iso /mnt/phrack
    - Change directory to /mnt/phrack/software.
    - Install the SDK by using the following command and answer any
    prompts:
    ./cellsdk install
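
    The install steps above can be collected into a small shell function.
    This is only a sketch under assumptions that are not in the article: the
    names install_cell_sdk, run and DRY_RUN are hypothetical, and the ISO
    path has to be supplied by the caller because the text leaves the SDK
    version open. By default it only prints the commands, since mounting
    needs root.

```shell
# Sketch only: install_cell_sdk, run and DRY_RUN are illustrative names,
# not part of the Cell SDK or of the article.
DRY_RUN="${DRY_RUN:-1}"    # 1 = print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it as given.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = path to the downloaded SDK ISO, $2 = mount point (default /mnt/phrack)
install_cell_sdk() {
    iso="$1"
    mnt="${2:-/mnt/phrack}"
    run mount -o loop "$iso" "$mnt" &&
    run cd "$mnt/software" &&
    run ./cellsdk install
}
```

    Calling install_cell_sdk with the path to the ISO as its first argument
    prints the three commands; setting DRY_RUN=0 (as root) would run them
    instead.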

    To start the simulator:
    cd /opt/IBM/systemsim-cell/run/cell/linux
    ../run_gui
    Then click on the 'go' button to start the simulated system.

    To copy files into the simulated system, run inside it:
    callthru source /home/bsdaemon/Phrack/hello_ppu > hello_ppu

    Then give the correct permissions and execute:
    chmod +x hello_ppu
    ./hello_ppu



    ------[ 8 - Sources [cell_samples.tgz]

    Attached are all the samples used in this article, to be compiled on
    Linux running on a Cell machine.

    Further updates will be available on the RISE Security website at:
    http://www.risesecurity.org

    For the author's public key:
    http://www.kernelhacking.com/rodrigo/docs/public.txt


    begin 644 cell_samples.tgz

    end



    --------{ EOF
    Ok, I will use the random copy paste thread now.
    Found somewhere on the internet:

    Much like manatees being mistaken for mermaids, this has some basis in fact. The way that it seems to work is that an incompetent person sees a teen perform some miracle of computing - changing system settings, troubleshooting a common network problem, etc. So, since this kid was able to outdo them, the kid must be A) Brilliant (so he could figure it out when they couldn't) and B) Highly skilled (because admitting that anyone with a reasonable amount of knowledge could do it would be admitting to ignorance). So then this person tells a friend about this 'genius', which is combined in the friend's mind with tales of 'hackers' who can break into computers. And there you have it - the teenage genius hacker.
    added on the 2009-06-30 13:44:07 by Optimus
    ; c64 tiny 303 driver - 4mat/orb
    ; compiles with c64asm
    * = $1000
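    ; init music
    sei
    ; clear zero page $03-$ff (x counts down from $fd to $01)
    lda #$00
    tay
    ldx #$fd
    argha
    sta $02,x
    dex
    bne argha
    ; set up the three SID voices; x is the voice register offset (0, 7, 14)
    setsound
    ; 54276 = $d404: control register, pulse waveform with gate on
    lda #$41
    sta 54276,x
    ; 54275 = $d403: pulse width, high byte
    lda #$08
    sta 54275,x
    ; 54277 = $d405: attack/decay
    lda #$00
    sta 54277,x
    ; 54278 = $d406: sustain/release
    lda vols,y
    sta 54278,x
    lda datas,y
    sta $6b,y
    ; 54295 = $d417: filter resonance/routing; 54296 = $d418: volume/filter mode
    lda siddata,y
    sta 54295,y
    iny
    clc
    txa
    adc #$07
    tax
    cpx #$15
    bne setsound
    ; point the IRQ vector at $0314/$0315 to musicloop
    lda #<musicloop
    sta $0314
    lda #>musicloop
    sta $0315
    cli
    ; idle here; the music runs from the IRQ
    loop
    jmp loop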
    ; play music
    musicloop dec $60
    bpl updatedrums
    lda #$06
    sta $60
    ldy $61
    lda $e1b5,y
    and #$0f
    sta 54273
    lda $e1b2,y
    and #$0f
    sta 54273+7
    noresetsq
    ldx #$ff
    drumcheck
    lda beat,y
    and btt,x
    beq nobit
    lda wav,x
    sta 54276+14
    sta $66
    lda not,x
    sta $67
    lda plu,x
    sta $68
    lda del,x
    sta $69
    nobit dex
    bpl drumcheck
    iny
    tya
    and #$0f
    sta $61
    bne noupdate
    dec $6b
    bpl noupdate
    lda #$17
    sta filtsweep+$01
    lda #$a0
    sta noupdate+$01
    dec noresetsq+$01
    lda noresetsq+$01
    and #$03
    sta noresetsq+$01
    tax
    lda basswaves,x
    sta 54276
    lda length,x
    sta $6b
    lsr
    sta length,x
    dec $6c
    lda $6c
    bpl noupdate
    jmp 64738
    noupdate
    lda #$2f
    sta $62
    updatedrums
    nogate lda $60
    cmp $69
    bne nodrumgate
    dec $66
    lda $66
    sta 54276+14
    nodrumgate
    clc
    lda $67
    adc $68
    sta $67
    sta 54273+14
    lda $62
    sta 54294
    filtsweep sbc #$00
    sta $62
    jmp $ea31
    ; music data
    datas .byte $05,$07
    siddata .byte $53,$1f,$00
    basswaves .byte $11,$51,$21,$41
    length .byte $01,$03,$07,$03
    vols .byte $c9,$49,$d5
    btt .byte $40,$02,$08,$00
    wav .byte $41,$81,$81
    not .byte $0a,$ff,$20
    plu .byte $ff,$fe,$fc
    del .byte $03,$02,$01
    beat .byte $40,$01,$02,$01,$08,$01,$02,$40,$40,$01,$02,$01,$08,$01,$02,$01
    added on the 2009-06-30 14:01:08 by 4mat
