Daneel: Type inference for Dalvik bytecode
In the last blog post about Daneel I mentioned one particular caveat of Dalvik bytecode, namely the existence of untyped instructions, which has a huge impact on how we transform bytecode. I want to take a similar approach as last time and look at one specific example to illustrate those implications. So let us take a look at the following Java method.
    public float untyped(float[] array, boolean flag) {
        if (flag) {
            float delta = 0.5f;
            return array[7] + delta;
        } else {
            return 0.2f;
        }
    }
The above is a straightforward snippet and most of you probably know what the generated Java bytecode will look like. So let’s jump right to the Dalvik bytecode and discuss that in detail.
    UntypedSample.untyped:([FZ)F:
    [regs=5, ins=3, outs=0]
    0000: if-eqz v4, 0009
    0002: const/high16 v0, #0x3f000000
    0004: const/4 v1, #0x7
    0005: aget v1, v3, v1
    0007: add-float/2addr v0, v1
    0008: return v0
    0009: const v0, #0x3e4ccccd
    000c: goto 0008
Keep in mind that Daneel doesn’t like to remember things, so he wants to look through the code just once from top to bottom and emit Java bytecode while doing so. He gets really puzzled at certain points in the code.
- Label 2: What is the type of register v0?
- Label 4: What is the type of register v1?
- Label 9: Register v0 again? What’s the type at this point?
You, as a reader, do have the answer because you know and understand the semantics of the underlying Java code, but Daneel doesn’t, so he tries to infer the types. Let’s look through the code in the same way Daneel does.
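To make the ambiguity concrete, here is a small Java sketch that reads the two literals from the listing as floats. The class name UntypedBits and its methods are made up for this illustration and are not part of Daneel. The point is only that the bit pattern alone does not say whether a constant is an int or a float.

```java
// Illustration only: UntypedBits is a hypothetical helper, not part of Daneel.
final class UntypedBits {
    /** Interprets a raw 32-bit Dalvik constant as a Java float. */
    static float asFloat(int bits) {
        return Float.intBitsToFloat(bits);
    }

    /** const/high16 encodes only the upper 16 bits; the lower 16 bits are zero. */
    static int high16(int upper) {
        return upper << 16;
    }

    public static void main(String[] args) {
        int first = high16(0x3f00);  // const/high16 v0, #0x3f000000
        int second = 0x3e4ccccd;     // const v0, #0x3e4ccccd
        // The same 32 bits printed under both interpretations.
        System.out.println(first + " as int, " + asFloat(first) + " as float");
        System.out.println(second + " as int, " + asFloat(second) + " as float");
    }
}
```

Read as floats, the two constants are exactly the 0.5f and 0.2f from the Java source, but nothing in the const instructions themselves says so.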
At method entry he knows about the types of method parameters. Dalvik passes parameters in the last registers (in this case in v3 and v4). Also we have a register (in this case v2) holding a this reference. So we start out with the following register types at method entry.
    UntypedSample.untyped:([FZ)F:
    [regs=5, ins=3, outs=0]               uninit uninit object [float bool
The array to the right represents the inferred register types at each point in the instruction stream as determined by the abstract interpreter. Note that we also have to keep track of the dimension count and the element type for array references. Now let’s look at the first block of instructions.
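As a rough sketch of how that entry state could be derived, the following Java code fills the last registers from the method descriptor. EntryState and its string type names are assumptions made for this example, not Daneel's real data structures, and the code assumes an instance method (so one register holds this).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: EntryState is a hypothetical name, not one of Daneel's classes.
final class EntryState {
    /** Splits the parameter part of a descriptor such as "([FZ)F" into type descriptors. */
    static List<String> parameterTypes(String descriptor) {
        List<String> types = new ArrayList<>();
        int i = 1;                                    // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            int start = i;
            while (descriptor.charAt(i) == '[') i++;  // array dimensions
            if (descriptor.charAt(i) == 'L') i = descriptor.indexOf(';', i); // object type ends at ';'
            i++;
            types.add(descriptor.substring(start, i));
        }
        return types;
    }

    /** long and double occupy two consecutive registers. */
    static boolean isWide(String type) {
        return type.equals("J") || type.equals("D");
    }

    /** Builds the register types at entry of an instance method. */
    static String[] atEntry(int regs, String descriptor) {
        List<String> params = parameterTypes(descriptor);
        int ins = 1;                                  // the 'this' reference
        for (String p : params) ins += isWide(p) ? 2 : 1;
        String[] state = new String[regs];
        Arrays.fill(state, "uninit");
        int r = regs - ins;                           // first incoming register
        state[r++] = "object";                        // 'this'
        for (String p : params) {
            state[r++] = p;
            if (isWide(p)) state[r++] = p + " (high half)";
        }
        return state;
    }
}
```

For the example method, EntryState.atEntry(5, "([FZ)F") yields uninit, uninit, object, [F, Z, which matches the row shown above with the descriptor letters in place of the spelled-out type names.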
    0002: const/high16 v0, #0x3f000000    u32    uninit object [float bool
    0004: const/4 v1, #0x7                u32    u32    object [float bool
    0005: aget v1, v3, v1                 u32    float  object [float bool
    0007: add-float/2addr v0, v1          float  float  object [float bool
Each line shows the register type after the instruction has been processed. At each line Daneel learns something new about the register types.
- Label 2: I don’t know the type of v0, only that it holds an untyped 32-bit value.
- Label 4: Same applies for v1 here, it’s an untyped 32-bit value as well.
- Label 5: Now I know v1 is used as an array index, it must have been an integer value. Also the array reference in register v3 is accessed, so I know the result is a float value. The result is stored in v1, overwriting its previous content.
- Label 7: Now I know v0 is used in a floating-point addition, it must have been a float value.
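The four steps above can be replayed with a tiny abstract interpreter. This is a minimal sketch under my own assumptions: TypeFrame, RegType and the method names are invented for illustration and say nothing about how Daneel's interpreter is actually structured.

```java
// Sketch only: TypeFrame and RegType are hypothetical names, not Daneel's classes.
final class TypeFrame {
    enum RegType { UNINIT, U32, INT, FLOAT, BOOL, OBJECT, FLOAT_ARRAY }

    final RegType[] regs;

    TypeFrame(RegType... regs) {
        this.regs = regs.clone();
    }

    /** An untyped const instruction defines a register of unknown 32-bit type. */
    void defineUntyped(int reg) {
        regs[reg] = RegType.U32;
    }

    /** A typed instruction writes its result into a register. */
    void define(int reg, RegType type) {
        regs[reg] = type;
    }

    /** A typed instruction reads a register; returns true if an untyped value was refined. */
    boolean use(int reg, RegType expected) {
        RegType actual = regs[reg];
        if (actual == RegType.U32) {
            regs[reg] = expected;       // now we know what the constant really was
            return true;
        }
        if (actual != expected) {
            throw new IllegalStateException("v" + reg + " is " + actual + ", expected " + expected);
        }
        return false;
    }

    /** Replays labels 2 to 7 of the example, starting from the entry state. */
    static TypeFrame firstBlock() {
        TypeFrame f = new TypeFrame(RegType.UNINIT, RegType.UNINIT,
                RegType.OBJECT, RegType.FLOAT_ARRAY, RegType.BOOL);
        f.defineUntyped(0);                       // 0002: const/high16 v0
        f.defineUntyped(1);                       // 0004: const/4 v1
        f.use(3, RegType.FLOAT_ARRAY);            // 0005: aget reads the array ...
        f.use(1, RegType.INT);                    //       ... and the index, so v1 was an int
        f.define(1, RegType.FLOAT);               //       the result overwrites v1
        f.use(0, RegType.FLOAT);                  // 0007: add-float/2addr reads v0, so v0 was a float
        f.use(1, RegType.FLOAT);
        f.define(0, RegType.FLOAT);
        return f;
    }
}
```

Each call to use that returns true corresponds to one of the "Now I know" moments in the list.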
Keep in mind that at each line, Daneel emits appropriate Java bytecode. So whenever he learns the concrete type of a register, he might need to retroactively patch previously emitted instructions, because some of his assumptions about the type were broken.
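One way to picture the retroactive patching is an emitter that writes placeholders for untyped constants and fills them in later. The post does not show how Daneel actually does this, so the sketch below is purely an assumption: PatchingEmitter is a hypothetical class, and it works on textual mnemonics instead of real class-file bytes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: PatchingEmitter is a hypothetical stand-in for a method rewriter.
final class PatchingEmitter {
    final List<String> code = new ArrayList<>();
    private final Map<Integer, Integer> pendingIndex = new HashMap<>(); // register -> placeholder position
    private final Map<Integer, Integer> pendingBits = new HashMap<>();  // register -> raw constant

    /** Emits placeholders for a constant whose type is not known yet. */
    void constUntyped(int reg, int bits) {
        pendingIndex.put(reg, code.size());
        pendingBits.put(reg, bits);
        code.add("ldc ?");
        code.add("?store " + reg);
    }

    /** Called once a later instruction reveals the register's type. */
    void resolve(int reg, boolean isFloat) {
        Integer at = pendingIndex.remove(reg);
        if (at == null) return;                  // nothing pending for this register
        int bits = pendingBits.remove(reg);
        if (isFloat) {
            code.set(at, "ldc " + Float.intBitsToFloat(bits) + "f");
            code.set(at + 1, "fstore " + reg);
        } else {
            code.set(at, "ldc " + bits);
            code.set(at + 1, "istore " + reg);
        }
    }
}
```

In the example, v1 would be resolved as an int at label 5 and v0 as a float at label 7, which turns the two placeholder pairs into an integer load and a float load respectively.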
Finally we look at the second block of instructions reached through the conditional branch as part of the if-statement.
    0009: const v0, #0x3e4ccccd           u32    uninit object [float bool
    000c: goto 0008                       float  uninit object [float bool
When reaching this block we basically have the same information as at method entry. Again Daneel learns in the process.
- Label 9: I don’t know the type of v0, only that it holds an untyped 32-bit value.
- Label 12: Now I know that v0 has to be a float value because the unconditional branch targets the join-point at label 8. And I already looked at that code and know that we expect a float value in that register at that point.
This illustrates why our abstract interpreter also has to remember and merge register type information at each join-point. It’s important to keep in mind that Daneel follows the instruction stream from top to bottom, as opposed to the control-flow of the code.
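The merge at a join-point can be sketched as a per-register rule. The rules below are a simplified assumption of mine (JoinPoint, Type and the CONFLICT marker are hypothetical), loosely modelled on how bytecode verifiers merge register types. They are not taken from Daneel's source.

```java
// Sketch only: JoinPoint and its merge rules are simplified assumptions.
final class JoinPoint {
    enum Type { UNINIT, U32, INT, FLOAT, BOOL, OBJECT, FLOAT_ARRAY, CONFLICT }

    static boolean isConcrete32Bit(Type t) {
        return t == Type.INT || t == Type.FLOAT || t == Type.BOOL;
    }

    /** Merges the type of one register as seen from two predecessors. */
    static Type merge(Type recorded, Type incoming) {
        if (recorded == incoming) return recorded;
        // An untyped 32-bit value adopts the concrete type known on the other path.
        if (incoming == Type.U32 && isConcrete32Bit(recorded)) return recorded;
        if (recorded == Type.U32 && isConcrete32Bit(incoming)) return incoming;
        // Anything else (including "assigned on one path only") is unusable after the join.
        return Type.CONFLICT;
    }

    /** Merges two complete register states of equal length. */
    static Type[] mergeFrames(Type[] recorded, Type[] incoming) {
        Type[] merged = new Type[recorded.length];
        for (int i = 0; i < recorded.length; i++) {
            merged[i] = merge(recorded[i], incoming[i]);
        }
        return merged;
    }
}
```

For label 8 in the example, the recorded state from the fall-through path is float, float, object, [float, bool and the state arriving from the goto is u32, uninit, object, [float, bool. Under these rules v0 becomes float, which is the refinement described above, while v1 is marked as a conflict because it is only assigned on one path. That is harmless here since the return at label 8 never reads v1.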
Now imagine scrambling up the code so that instruction stream and control-flow are vastly different from each other, together with a few exception handlers and an optimal register re-usage as produced by some SSA representation. That’s where Daneel still keeps choking at the moment. But we can handle most of the code produced by the dx tool already and will hunt down all those nasty bugs triggered by obfuscated code as well.
Disclaimer: The abstract interpreter and the method rewriter were mostly written by Rémi Forax. With this post I take no credit for their implementation whatsoever; I just want to explain how they work.