ESPtool - the magic sflash stub

The open source ESP8266 esptool, which is part of the esp-open-sdk, creates a firmware image from an ELF image/binary. It is also used to flash/program an ESP8266 with this firmware image. You can also do the opposite and download a firmware image from the chip.


Browsing through the source code, I found an interesting binary blob that I just had to take a closer look at.

# Sflash stub: an assembly routine to read from spi flash and send to host
SFLASH_STUB = "\x80\x3c\x00\x40\x1c\x4b\x00\x40\x21\x11\x00\x40\x00\x80\xfe" \
              "\x3f\xc1\xfb\xff\xd1\xf8\xff\x2d\x0d\x31\xfd\xff\x41\xf7\xff" \
              "\x4a\xdd\x51\xf9\xff\xc0\x05\x00\x21\xf9\xff\x31\xf3\xff\x41" \

It is used in the following Python function.

""" Read SPI flash """
def flash_read(self, offset, size, count=1):
    # Create a custom stub
    stub = struct.pack('<III', offset, size, count) + self.SFLASH_STUB

    # Trick ROM to initialize SFlash
    self.flash_begin(0, 0)

    # Download stub
    self.mem_begin(len(stub), 1, len(stub), 0x40100000)
    self.mem_block(stub, 0)

This means the stub is prefixed with three 32-bit integers: offset, size and count, in that order. The stub plus prefix is loaded into memory at 0x40100000 and execution then starts at 0x4010001c. The entry point is 0x1c = 28 bytes in, and the prefix only takes up 12 of those, so the first four 32-bit words (16 bytes) of the stub itself are probably data.
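The layout arithmetic can be checked with a short Python sketch. It only uses the first 16 bytes of the stub as quoted above; the offset, size and count values passed to struct.pack are made-up examples, and reading the four words as addresses is my interpretation, not something esptool states.

```python
import struct

# First 16 bytes of SFLASH_STUB exactly as quoted above.
STUB_HEAD = bytes.fromhex("803c0040" "1c4b0040" "21110040" "0080fe3f")

LOAD_ADDR = 0x40100000   # where the prefix + stub is downloaded
ENTRY_ADDR = 0x4010001c  # where execution starts

# Hypothetical example arguments: offset, size, count.
prefix = struct.pack('<III', 0x0, 0x400, 4)

entry_offset = ENTRY_ADDR - LOAD_ADDR        # bytes from load address to entry point
data_in_stub = entry_offset - len(prefix)    # bytes of the stub skipped before the entry point

# Decode the skipped bytes as little-endian 32-bit words.
words = struct.unpack('<IIII', STUB_HEAD)

print(len(prefix), entry_offset, data_in_stub)
print([hex(w) for w in words])
```

The first three words land in the 0x4000xxxx range and the fourth in the 0x3ffexxxx range, which fits the idea that they are addresses rather than instructions.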


I figured this was a perfect use case for showing how to use radare. Here it is, as an asciinema recording.

There you have it! esptool uploads the sflash stub and executes it. The stub itself contains a small function which calls two functions in a loop. First SPIRead is called to fetch a block from flash into RAM. Then send_packet is called to send this block back over UART. Both SPIRead and send_packet reside in ROM.
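To make the control flow concrete, here is a rough pure-Python model of that loop. It is not a translation of the Xtensa code: run_stub_model, the fake flash contents and the callback are all hypothetical, and I am assuming that each pass reads the next block (offset advancing by size), which the disassembly would have to confirm.

```python
def run_stub_model(flash, offset, size, count, send_packet):
    """Model of the stub's loop: read a block, send it, repeat count times."""
    for _ in range(count):
        block = flash[offset:offset + size]  # stands in for the ROM's SPIRead
        send_packet(block)                   # stands in for the ROM's send_packet
        offset += size                       # assumption: move on to the next block


# Fake 1 KB "flash" and a list that collects what would go out over UART.
flash = bytes(range(256)) * 4
received = []
run_stub_model(flash, offset=0x10, size=0x20, count=3, send_packet=received.append)

print(len(received), [len(block) for block in received])
```

On the host side, esptool would then read count packets of size bytes each.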


ESP8266 is a wonderful little piece of hardware I stumbled over some months ago. It has gotten a lot of attention since some time back in 2014, mainly because it is an incredibly cheap WiFi module (less than $2 these days). It’s not until recently that I’ve had some time to actually look into it. Personally I like to know how things work from the bottom up, so that means digging down into the electronics and the firmware/software you can either use out of the box or build yourself for the ESP8266.

ESP-12E and a banana for scale


Early on I decided that I wanted to know more about the chip and its architecture. So far I’ve done most of my research on the components of the ESP8266EX version: the CPU, the memory and how it boots.


ESP8266(EX) is made by Espressif, and multiple vendors sell different kinds of modules embedding it. See this list, which is somewhat up to date. Personally I’ve bought the ESP-01 and ESP-12E and will probably mostly use the latter. The ESP-12E is from AI-Thinker and is based on the ESP8266EX chip. See Espressif’s newest datasheet (0A-ESP8266EX__Datasheet__EN_V4.7_20160225.pdf).


Espressif is using Tensilica Xtensa in their architecture. This is a 32-bit RISC architecture. The specific model is L106, also known as lx106. As far as I understand, the architecture doesn’t have any form of CPU cache.

See Max Filippov’s crosstool-ng repo for a C/C++ toolchain. You will also need some carefully written linker scripts. A better option is to use the esp-open-sdk, which bundles the toolchain, Xtensa HAL, esptool and the Espressif SDK. Very handy!


There is a lot of conflicting information about the memory on the inter tubes. The datasheet doesn’t say a lot about it apart from the fact that the chip includes SRAM and ROM. Apparently there are separate data and instruction buses, dBus and iBus respectively, plus an AHB bus. Neither SRAM nor ROM size is specified explicitly. However, this is what the datasheet says about size:

When ESP8266EX is working under the station mode and is connected to the router, programmable space is accessible to user in heap and data section is 50KB.

From information gathered from the linker scripts and from looking at how the stack is set up after boot, I’ve concluded:

Address      Size      Description
0x3FFE8000   0x14000   RAM data - Heap
0x3FFFC000   0x4000    RAM data - Stack
0x40000000   0x10000   ROM (incl. bootloader)
0x40100000   0x8000    RAM instructions

So there should at least be:

  • 96 KB data RAM
  • 32 KB instruction RAM
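The totals follow directly from the sizes in the table. A quick sanity check in Python, using only the table values (the dictionary and variable names are mine):

```python
# Region sizes in bytes, taken from the table above.
REGION_SIZES = {
    "RAM data - Heap": 0x14000,
    "RAM data - Stack": 0x4000,
    "ROM (incl. bootloader)": 0x10000,
    "RAM instructions": 0x8000,
}

KIB = 1024

data_ram_kib = (REGION_SIZES["RAM data - Heap"] + REGION_SIZES["RAM data - Stack"]) // KIB
instruction_ram_kib = REGION_SIZES["RAM instructions"] // KIB
rom_kib = REGION_SIZES["ROM (incl. bootloader)"] // KIB

# First address past the heap region: base address plus size.
heap_end = 0x3FFE8000 + REGION_SIZES["RAM data - Heap"]

print(data_ram_kib, instruction_ram_kib, rom_kib, hex(heap_end))
```

So the data RAM figure is 80 KB of heap plus 16 KB of stack.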

In addition to this, SPI flash can be mapped to memory through a cache of some sort. This is used to execute instructions directly from flash instead of copying them over to instruction RAM first. Flash cache should be slower than RAM, but I haven’t done any testing myself.


Running esptool’s flash_id command (esptool.py --port /dev/ttyUSB0 flash_id) against my ESP-12E identifies the flash chip:

Manufacturer: e0
Device: 4016

Which I believe is BergMicro BG25Q32 (32Mbit - 4MB).
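The 32Mbit figure can be read out of the device ID itself. Many SPI NOR flash parts encode the capacity in the low byte of the device ID as log2 of the size in bytes. That is a common convention and not a guarantee, so treat this sketch (and the helper name decode_flash_id) as an assumption:

```python
def decode_flash_id(manufacturer_hex, device_hex):
    """Split the IDs printed by flash_id into manufacturer, memory type and size."""
    manufacturer = int(manufacturer_hex, 16)
    device = int(device_hex, 16)
    memory_type = device >> 8        # high byte of the device ID
    capacity_code = device & 0xFF    # low byte of the device ID
    # Assumption: capacity code is log2(size in bytes), as on many SPI NOR parts.
    size_bytes = 1 << capacity_code
    return manufacturer, memory_type, size_bytes


manufacturer, memory_type, size_bytes = decode_flash_id("e0", "4016")
size_mib = size_bytes // (1024 * 1024)
size_mbit = size_bytes * 8 // (1024 * 1024)

print(hex(manufacturer), hex(memory_type), size_mib, "MiB", size_mbit, "Mbit")
```

For device 4016 the capacity code is 0x16 = 22, and 2^22 bytes is 4MB.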


I’ve skipped all the electronics that you probably either already know or should read more about elsewhere. Buying a module like the ESP-12E requires you to do a bit of soldering. The inter tubes are full of articles/posts on how to connect your ESP8266 using a breadboard, protoboard or whatever. Google it!

One thing I should mention, which hurt me for a while, is that when you boot the stock (AT) firmware, the ESP8266 draws >200mA. This means that powering it from USB through your USB to serial adaptor will probably NOT work. I learned this the hard way, after a day of cursing and knocking my head against the wall. You are better off using a separate power supply which can handle 500mA @ 3.3V. A capacitor or two across VDD/VCC and ground close to the chip is probably also a good idea. I’m an electronics rookie, so if you are having problems, talk to an expert.


There are a couple of things you should know about the serial interface.

  1. The first stage bootloader initializes UART0 with baud rate based on the external crystal oscillator.
  2. The stock (AT) firmware and probably others use a baud rate of 115200.
  3. Booting in UART mode (also known as flashing/programming mode) uses auto sensing baud rate AFAIK.

I’ve experimented with a couple of applications on Linux for serial communication with the ESP8266. I used picocom for a while because it’s pretty straightforward and supports CRLF, which is what you need for executing AT commands using the AT firmware. Stock picocom doesn’t support custom baud rates; you would need to build picocom yourself with that enabled. So reading bootloader messages doesn’t work out of the box. The same thing goes for minicom and screen. You could probably make it work by initializing the serial terminal using stty. As I’m no expert on serial terminals, serial communication drivers or the kernel infrastructure for this, I’ve just tried to find what’s easiest to use.

I highly recommend using the terminal program included in the pyserial Python package. Run it with /dev/ttyUSB0 74880 for reading the bootloader messages and with /dev/ttyUSB0 115200 for anything else, unless you want high speed communication and/or you have firmware that uses something else.
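Where does the odd 74880 come from? A commonly given explanation, which is an assumption on my part and not something I found in the datasheet, is that the boot ROM programs the UART for 115200 baud as if a 40 MHz crystal were fitted, while many modules carry a 26 MHz crystal. The numbers work out:

```python
from fractions import Fraction

# Assumptions, not datasheet facts: the ROM expects 40 MHz, the module has 26 MHz.
ASSUMED_CRYSTAL_HZ = 40_000_000
ACTUAL_CRYSTAL_HZ = 26_000_000
INTENDED_BAUD = 115200

# The UART clock scales with the crystal, so the real baud rate scales by the same ratio.
effective_baud = INTENDED_BAUD * Fraction(ACTUAL_CRYSTAL_HZ, ASSUMED_CRYSTAL_HZ)

print(int(effective_baud))
```

If your module uses a different crystal, the bootloader baud rate would differ accordingly.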

Typical output from the bootloader:

 ets Jan  8 2013,rst cause:2, boot mode:(3,7)

load 0x40100000, len 1396, room 16
tail 4
chksum 0x89
load 0x3ffe8000, len 776, room 4
tail 4
chksum 0xe8
load 0x3ffe8308, len 540, room 4
tail 8
chksum 0xc0
csum 0xc0

2nd boot version : 1.4(b1)
  SPI Speed      : 40MHz
  SPI Mode       : DIO
  SPI Flash Size & Map: 8Mbit(512KB+512KB)
jump to run user1 @ 1000

In boot mode:(x,y), x is the boot mode; here x = 3 and y = 7.
In rst cause:z, z is the reset cause; here z = 2.

Code   Cause
1      Normal boot
2      Reset pin
3      Software reset
4      Watchdog reset
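If you log a lot of boots, pulling these numbers out by hand gets old. A small sketch that parses the first bootloader line and looks up the reset cause in the table above (the function name and the returned dictionary are my own invention):

```python
import re

# Reset causes from the table above.
RESET_CAUSES = {
    1: "Normal boot",
    2: "Reset pin",
    3: "Software reset",
    4: "Watchdog reset",
}

BOOT_LINE = re.compile(r"rst cause:(\d+), boot mode:\((\d+),(\d+)\)")


def parse_boot_line(line):
    """Return reset cause and boot mode from the first bootloader line, or None."""
    match = BOOT_LINE.search(line)
    if match is None:
        return None
    cause, mode, y = (int(group) for group in match.groups())
    return {
        "rst_cause": cause,
        "rst_cause_text": RESET_CAUSES.get(cause, "unknown"),
        "boot_mode": mode,
        "y": y,
    }


info = parse_boot_line(" ets Jan  8 2013,rst cause:2, boot mode:(3,7)")
print(info)
```

Codes that are not in the table simply come back as "unknown".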

Boot modes

When you reset the ESP8266 it can boot in a couple of different modes.

MTDO   GPIO0   GPIO2   Code   Mode    Description
L      L       H       1      UART    Download over UART
L      H       H       3      Flash   Boot from SPI Flash
H      x       x       4-7    SDIO    Boot from SD

L = low
H = high
x = floating
MTDO, GPIO0 and GPIO2 together form the 3-bit Code, with MTDO as the most significant bit.
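The Code column is just the three pin levels read as a binary number. A minimal sketch, with L = 0 and H = 1; the function names are hypothetical and the bit order is inferred from the table rows:

```python
def boot_mode_code(mtdo, gpio0, gpio2):
    """Combine the pin levels (0 = low, 1 = high) into the 3-bit code."""
    return (mtdo << 2) | (gpio0 << 1) | gpio2


def boot_mode_name(code):
    """Map a code to the mode listed in the table above."""
    if code == 1:
        return "UART"
    if code == 3:
        return "Flash"
    if 4 <= code <= 7:
        return "SDIO"
    return "not listed"


uart_code = boot_mode_code(0, 0, 1)
flash_code = boot_mode_code(0, 1, 1)
# With MTDO high, every combination of the other two pins lands in 4-7.
sdio_codes = sorted(boot_mode_code(1, g0, g2) for g0 in (0, 1) for g2 in (0, 1))

print(uart_code, boot_mode_name(uart_code))
print(flash_code, boot_mode_name(flash_code))
print(sdio_codes)
```

This matches the boot mode:(3,7) line in the output above, where x = 3 means booting from SPI flash.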

What’s next?

I’m going to research a bit more to find out how both Flash and UART boot mode actually works. Both modes are already somewhat reverse engineered and documented elsewhere, but it is fun trying to understand and then explain it in a later post. So be sure to check back later for more! :)

Contributing to Radare2

You might wonder: what’s Radare or Radare2? Well, head over to and check it out. Basically it is a reverse engineering tool for static software analysis. You can also use it for debugging I guess, but my main use of it will be reverse engineering binary blobs by disassembling machine code to understand how it works.

Moving over to github pages

It’s weird. A year ago or so I thought using WordPress would be easy and simple enough to actually write something quite often. It turns out I was wrong. First of all I don’t like WordPress for its technology. It’s PHP! :P And it uses a database (MySQL). Second, it is a pain to use. Log in, use a WYSIWYG editor and publish through the browser. Changing WordPress, extending or touching it apart from installing plugins seems tedious and a waste of time. Even just hunting for the proper plugins is a waste of time if you ask me. So…

Mouse support in tmux 2.1

I’m a vim+tmux user/believer! I recently installed OS X El Capitan and upgraded most of my Homebrew Cellar, so to speak. By doing that I got tmux 2.1, where apparently mouse-mode/mode-mouse is completely rewritten. I don’t use the mouse for much, but it is still nice to have when resizing panes.

So, last week I signed up for to watch some Europa League football (soccer). The site looks trendy and good, and my experience watching live sports for 90 minutes was overall good. No HD from what I could find and some occasional hiccups in video streaming. I can live with that, although HD would be nice!

Hafslund Android app

Some weeks ago I dropped my Android phone into a bucket of paint. It turned out that the phone didn’t quite like that. I turned it off and gave it a good long shower, then blow dried it for 15 minutes. After some hours I turned it on, which I regret. I should have waited for a day or two, or even tried to extract more of the moisture out of it. Well, it booted, but I noticed something wasn’t quite right. After some minutes it locked up. Trying to boot the phone now doesn’t work, as the kernel seems to panic before boot completes.


Last week Ryan Lortie blogged about some new macros I really have been missing for years of doing GObject development.

GDB - run until crash

When you write code you most likely want to add test coverage for it by adding a unit test, module test or whatever you want to call it. Personally I mainly work on highly threaded C/C++ code, where there inevitably will be bugs, either in the form of non-deterministic tests or in the code itself as deadlocks or data races. In my current position we have an extensive CI system which from time to time reports that there is such a test. Even though the test runs millions of times each day, it could be months or years between runs that make the test fail.

TCP socket - SO_SNDBUF

I came across some code that, after connecting a non-blocking TCP socket, did

int sndsize = 2048;
setsockopt (socket, SOL_SOCKET, SO_SNDBUF, &sndsize, sizeof (int));