Daneel: Type inference for Dalvik bytecode

In the last blog post about Daneel I mentioned one particular caveat of Dalvik bytecode: the existence of untyped instructions, which has a huge impact on how we transform bytecode. As last time, I want to look at one specific example to illustrate the implications. So let us take a look at the following Java method.

public float untyped(float[] array, boolean flag) {
   if (flag) {
      float delta = 0.5f;
      return array[7] + delta;
   } else {
      return 0.2f;
   }
}

The above is a straightforward snippet and most of you probably know what the generated Java bytecode will look like. So let’s jump right to the Dalvik bytecode and discuss that in detail.

UntypedSample.untyped:([FZ)F:
  [regs=5, ins=3, outs=0]
   0000: if-eqz v4, 0009
   0002: const/high16 v0, #0x3f000000
   0004: const/4 v1, #0x7
   0005: aget v1, v3, v1
   0007: add-float/2addr v0, v1
   0008: return v0
   0009: const v0, #0x3e4ccccd
   000c: goto 0008

Keep in mind that Daneel doesn’t like to remember things, so he wants to look through the code just once from top to bottom and emit Java bytecode while doing so. He gets really puzzled at certain points in the code.

  • Label 2: What is the type of register v0?
  • Label 4: What is the type of register v1?
  • Label 9: Register v0 again? What’s the type at this point?

You, as a reader, do have the answer because you know and understand the semantics of the underlying Java code. Daneel doesn’t, so he tries to infer the types. Let’s look through the code in the same way Daneel does.

At method entry he knows the types of the method parameters. Dalvik passes parameters in the last registers (in this case v3 and v4). In addition, one register (in this case v2) holds the this reference. So we start out with the following register types at method entry.

UntypedSample.untyped:([FZ)F:
  [regs=5, ins=3, outs=0]               uninit uninit object [float bool
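
As a minimal sketch of how such an entry state could be derived, the following Java snippet builds the register array from the register count and the method descriptor. The class EntryState and its methods are hypothetical names chosen for illustration, not Daneel's actual code, and only the few descriptor characters needed here are handled (wide types such as long and double are left out).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not Daneel's actual code: derives the register types
// at method entry from the register count and the method descriptor.
class EntryState {

    // Parses the parameter part of a descriptor such as "([FZ)F" into
    // readable type names. Wide types (J, D) are deliberately not handled.
    static List<String> parameterTypes(String descriptor) {
        List<String> types = new ArrayList<>();
        int i = 1; // skip the opening '('
        while (descriptor.charAt(i) != ')') {
            StringBuilder type = new StringBuilder();
            while (descriptor.charAt(i) == '[') { // one '[' per array dimension
                type.append('[');
                i++;
            }
            char c = descriptor.charAt(i);
            switch (c) {
                case 'F': type.append("float"); i++; break;
                case 'I': type.append("int"); i++; break;
                case 'Z': type.append("bool"); i++; break;
                case 'L': // object type, runs up to the next ';'
                    type.append("object");
                    i = descriptor.indexOf(';', i) + 1;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported type: " + c);
            }
            types.add(type.toString());
        }
        return types;
    }

    // Builds the register array: locals start out uninitialized, followed by
    // the 'this' reference (instance methods only) and then the parameters.
    static String[] entryRegisters(int regs, String descriptor, boolean isStatic) {
        List<String> ins = new ArrayList<>();
        if (!isStatic) {
            ins.add("object");
        }
        ins.addAll(parameterTypes(descriptor));
        String[] registers = new String[regs];
        Arrays.fill(registers, "uninit");
        int firstIn = regs - ins.size(); // incoming values use the last registers
        for (int k = 0; k < ins.size(); k++) {
            registers[firstIn + k] = ins.get(k);
        }
        return registers;
    }

    public static void main(String[] args) {
        // regs=5 and the descriptor ([FZ)F are taken from the listing above.
        System.out.println(String.join(" ", entryRegisters(5, "([FZ)F", false)));
    }
}
```

For the sample method this prints uninit uninit object [float bool, the same five entries as in the listing.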

The columns to the right of each instruction show the inferred types of registers v0 to v4 at that point in the instruction stream, as determined by the abstract interpreter. Note that for array references we also have to keep track of the dimension count and the element type. Now let’s look at the first block of instructions.

   0002: const/high16 v0, #0x3f000000   u32    uninit object [float bool
   0004: const/4 v1, #0x7               u32    u32    object [float bool
   0005: aget v1, v3, v1                u32    float  object [float bool
   0007: add-float/2addr v0, v1         float  float  object [float bool

Each line shows the register types after the instruction has been processed. At each line Daneel learns something new about the register types.

  • Label 2: I don’t know the type of v0, only that it holds an untyped 32-bit value.
  • Label 4: Same applies for v1 here, it’s an untyped 32-bit value as well.
  • Label 5: Now I know v1 is used as an array index, so it must have been an integer value. Also the array reference in register v3 is accessed, so I know the result is a float value. The result is stored in v1, overwriting its previous content.
  • Label 7: Now I know v0 is used in a floating-point addition, it must have been a float value.
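
The four steps above can be replayed with a small sketch. This is only an illustration under simplifying assumptions: BlockWalk and RegType are made-up names, the type set is reduced to what this example needs, and the only refinement rule is that a use turns an untyped 32-bit value into the type that the use requires.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: BlockWalk and RegType are made-up names,
// not classes from Daneel.
class BlockWalk {

    enum RegType { UNINIT, U32, INT, FLOAT, OBJECT, FLOAT_ARRAY, BOOL }

    private final RegType[] regs;

    BlockWalk(RegType... entry) {
        regs = entry.clone();
    }

    // Using register r as 'expected' refines an untyped 32-bit value to
    // that type. Any other mismatch is reported as an error.
    private void use(int r, RegType expected) {
        if (regs[r] == RegType.U32) {
            regs[r] = expected;
        } else if (regs[r] != expected) {
            throw new IllegalStateException(
                "v" + r + " is " + regs[r] + ", expected " + expected);
        }
    }

    // const, const/4, const/high16: the value is just 32 untyped bits.
    void constant(int dst) {
        regs[dst] = RegType.U32;
    }

    // aget on a float array: the index must be an int, the result is a float.
    void agetFloat(int dst, int array, int index) {
        use(array, RegType.FLOAT_ARRAY);
        use(index, RegType.INT);
        regs[dst] = RegType.FLOAT;
    }

    // add-float/2addr: both operands must be floats, the result goes to 'a'.
    void addFloat2addr(int a, int b) {
        use(a, RegType.FLOAT);
        use(b, RegType.FLOAT);
        regs[a] = RegType.FLOAT;
    }

    String snapshot() {
        return Arrays.toString(regs);
    }

    // Replays labels 2 to 7 of the listing and records the state after each.
    static List<String> firstBlock() {
        BlockWalk w = new BlockWalk(RegType.UNINIT, RegType.UNINIT,
            RegType.OBJECT, RegType.FLOAT_ARRAY, RegType.BOOL);
        List<String> states = new ArrayList<>();
        w.constant(0);
        states.add("0002: " + w.snapshot());
        w.constant(1);
        states.add("0004: " + w.snapshot());
        w.agetFloat(1, 3, 1);
        states.add("0005: " + w.snapshot());
        w.addFloat2addr(0, 1);
        states.add("0007: " + w.snapshot());
        return states;
    }

    public static void main(String[] args) {
        firstBlock().forEach(System.out::println);
    }
}
```

The last recorded state is [FLOAT, FLOAT, OBJECT, FLOAT_ARRAY, BOOL], matching the row for label 7. Note that at label 5 register v1 is first refined to an int by its use as index and only then overwritten with the float result.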

Keep in mind that at each line, Daneel emits appropriate Java bytecode. So whenever he learns the concrete type of a register, he might need to retroactively patch previously emitted instructions, because some of his assumptions about the type were broken.
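
To make the patching idea concrete, here is a rough sketch. It assumes that an untyped constant is first emitted as an int load and rewritten once a later use proves it to be a float. PatchingEmitter is a hypothetical class, the strings merely stand in for real Java bytecode, and mapping register numbers directly to local variable slots is a simplification of mine.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical emitter: the strings stand in for real Java bytecode.
class PatchingEmitter {

    final List<String> code = new ArrayList<>();

    // For each register holding an untyped constant: the index of the
    // emitted load and the raw 32-bit value.
    private final Map<Integer, int[]> pending = new HashMap<>();

    // Emits the constant as an int for now and remembers where it went.
    void emitConst(int reg, int bits) {
        pending.put(reg, new int[] { code.size(), bits });
        code.add("ldc 0x" + Integer.toHexString(bits) + " ; istore " + reg);
    }

    // A later use proved the register to be an int: the guess was right.
    void learnInt(int reg) {
        pending.remove(reg);
    }

    // A later use proved the register to be a float: patch the earlier load.
    void learnFloat(int reg) {
        int[] p = pending.remove(reg);
        if (p != null) {
            code.set(p[0], "ldc " + Float.intBitsToFloat(p[1]) + "f ; fstore " + reg);
        }
    }

    // Replays the constants of the first block in the listing.
    static List<String> demo() {
        PatchingEmitter e = new PatchingEmitter();
        e.emitConst(0, 0x3f000000); // 0002: const/high16 v0
        e.emitConst(1, 0x7);        // 0004: const/4 v1
        e.learnInt(1);              // 0005: v1 is used as an array index
        e.learnFloat(0);            // 0007: v0 is used in add-float
        return e.code;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

After the replay the first entry reads ldc 0.5f ; fstore 0, because the bit pattern 0x3f000000 is the float 0.5, while the load for v1 stays an int.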

Finally we look at the second block of instructions reached through the conditional branch as part of the if-statement.

   0009: const v0, #0x3e4ccccd          u32    uninit object [float bool
   000c: goto 0008                      float  uninit object [float bool

When reaching this block we basically have the same information as at method entry. Again Daneel learns in the process.

  • Label 9: I don’t know the type of v0, only that it holds an untyped 32-bit value.
  • Label 12 (000c in the listing): Now I know that v0 has to be a float value because the unconditional branch targets the join-point at label 8. I have already looked at that code and know that a float value is expected in that register at that point.

This illustrates why our abstract interpreter also has to remember and merge register type information at each join-point. It’s important to keep in mind that Daneel follows the instruction stream from top to bottom, as opposed to the control-flow of the code.
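
A merge at a join-point could look roughly like the following sketch. The rules are assumptions for illustration, not a description of what Daneel really does: equal types are kept, an untyped 32-bit value adopts the concrete int or float type found on the other path, and every other mismatch marks the register as unusable (modelled here as UNINIT).

```java
import java.util.Arrays;

// Illustrative merge rules, assumed for this sketch and not taken from Daneel.
class JoinMerge {

    enum RegType { UNINIT, U32, INT, FLOAT, OBJECT, FLOAT_ARRAY, BOOL }

    static boolean isConcrete32(RegType t) {
        return t == RegType.INT || t == RegType.FLOAT;
    }

    // Merges the types that one register has on two incoming paths.
    static RegType merge(RegType a, RegType b) {
        if (a == b) {
            return a;
        }
        if (a == RegType.U32 && isConcrete32(b)) {
            return b; // the other path tells us what the untyped value is
        }
        if (b == RegType.U32 && isConcrete32(a)) {
            return a;
        }
        return RegType.UNINIT; // conflicting types: unusable after the join
    }

    // Merges two complete register states element by element.
    static RegType[] merge(RegType[] a, RegType[] b) {
        RegType[] out = new RegType[a.length];
        for (int i = 0; i < a.length; i++) {
            out[i] = merge(a[i], b[i]);
        }
        return out;
    }

    // State recorded at label 8 on fall-through, merged with the state
    // arriving from the goto at label 12.
    static RegType[] demo() {
        RegType[] fallThrough = { RegType.FLOAT, RegType.FLOAT,
            RegType.OBJECT, RegType.FLOAT_ARRAY, RegType.BOOL };
        RegType[] fromGoto = { RegType.U32, RegType.UNINIT,
            RegType.OBJECT, RegType.FLOAT_ARRAY, RegType.BOOL };
        return merge(fallThrough, fromGoto);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```

With these assumed rules the result is [FLOAT, UNINIT, OBJECT, FLOAT_ARRAY, BOOL]: v0 is resolved to a float, and v1, which is only set on one of the two paths, is treated as unusable after the join.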

Now imagine scrambling up the code so that instruction stream and control-flow are vastly different from each other, together with a few exception handlers and an optimal register re-usage as produced by some SSA representation. That’s where Daneel still keeps choking at the moment. But we can handle most of the code produced by the dx tool already and will hunt down all those nasty bugs triggered by obfuscated code as well.

Disclaimer: The abstract interpreter and the method rewriter were mostly written by Rémi Forax. With this post I take no credit for its implementation whatsoever; I just want to explain how it works.
