Modeling of Architectures

Alglave, Jade

doi:10.1007/978-3-319-18941-3_3

Modeling of Architectures

Jade Alglave¹⁵

Chapter
First Online: 01 January 2015

668 Accesses
2 Citations

Part of the book series: Lecture Notes in Computer Science ((LNPSE,volume 9104))

Abstract

Concurrent programming is known to be quite hard. It is made even harder by the fact that, very often, the execution models of the machines we run our software on are not precisely defined.

This document is a tutorial on the herd tool and the cat language, in which one can define consistency models.

This is a preview of subscription content, log in via an institution.

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Learn about institutional subscriptions

References

Alglave, J.: A shared memory poetics. Ph.D. thesis, Université Paris 7 (2010)
Google Scholar
Alglave, J.: A formal hierarchy of weak memory models. Formal Methods Syst. Des. 41(2), 178–210 (2012)
Article MATH Google Scholar
Alglave, J., Batty, M., Donaldson, A.F., Gopalakrishnan, G., Ketema, J., Poetzl, D., Sorensen, T., Wickerson, J.: GPU concurrency: weak behaviours and programming assumptions. In: ASPLOS (2015)
Google Scholar
Alglave, J., Kroening, D., Lugton, J., Nimal, V., Tautschnig, M.: Soundness of data flow analyses for weak memory models. In: Yang, H. (ed.) APLAS 2011. LNCS, vol. 7078, pp. 272–288. Springer, Heidelberg (2011)
Chapter Google Scholar
Alglave, J., Maranget, L., Tautschnig, M.: Herding cats: modelling, simulation, testing, and data-mining for weak memory. TOPLAS 36(2) (2014)
Google Scholar
The Beatles: The Continuing Story of Bungalow Bill. In: White Album (1968)
Google Scholar
Hower, D.R., Hechtman, B.A., Beckmann, B.M., Gaster, B.R., Hill, M.D., Reinhardt, S.K., Wood, D.A.: Heterogeneous-race-free memory models. In: ASPLOS 14 (2014)
Google Scholar
Lamport, L.: How to make a multiprocessor computer that correctly executes multiprocess programs. IEEE Trans. Comput. 28(9), 690–691 (1979)
Article MATH Google Scholar
Roxy Music: Re-Make Re-Model. In: Roxy Music (1972)
Google Scholar
Owens, S., Sarkar, S., Sewell, P.: A better x86 memory model: x86-TSO. In: Berghofer, S., Nipkow, T., Urban, C., Wenzel, M. (eds.) TPHOLs 2009. LNCS, vol. 5674, pp. 391–407. Springer, Heidelberg (2009)
Chapter Google Scholar
SPARC International Inc.: The SPARC Architecture Manual Version 9 (1994)
Google Scholar
Sparks: Here Kitty. In: Hello Young Lovers (2006)
Google Scholar
T-Rex: Jeepster. In: Electric Warrior (1971)
Google Scholar
Vafeiadis, V., Balabonski, T., Chakraborty, S., Morisset, R., Nardelli, F.Z.: Common compiler optimisations are invalid in the C11 memory model and what we can do about it. In: POPL (2015)
Google Scholar
Sonic Youth: Kool Thing. In: Goo (1990)
Google Scholar
Led Zeppelin: Black Dog. In: Led Zeppelin IV (1971)
Google Scholar

Download references

Author information

Authors and Affiliations

Microsoft Research Cambridge and University College London, London, UK
Jade Alglave

Authors

Jade Alglave
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Jade Alglave .

Editor information

Editors and Affiliations

Università di Urbino "Carlo Bo", Urbino, Italy
Marco Bernardo
University of Oslo, Oslo, Norway
Einar Broch Johnsen

Appendices

A Answers to Exercises

1.1 A.1 First Step in Herding Cats

Exercise: what do you think?. Do you think such an execution is possible?

Yes! On IBM Power or ARM machines for example [5].

1.2 A.2 First Big Cat: Tiger

Exercise: SC per Location with Load-Load Hazard. Certain architectures, such as Sparc RMO [11] allow what is sometimes called load-load hazard, i.e. a situation where the coRR test that we’ve just seen is allowed to yield the result 0:r1=1; 0:r2=0;. How do you think we can build a check that forbids all tests coWW, coRW1, coRW2 and coWR, but allows the test coRR?

In cat speak:

Intuitively what we’re doing here is removing the read-read pairs (R*R) from po-loc to build the po-loc-llh relation, then enforcing that this new relation is compatible with the communication relations com.

Let’s run herd on coWW, coRW1, coRW2, and coWR, and observe that they are still forbidden. Now let’s run it on coRR and observe that it is allowed. We’ve built a check that enforces SC per location but allows for load-load hazards!

Exercise: Implementing Address Dependencies. How do you think we can implement address dependencies? By this I mean that I’d like to see a sequence of instructions which, placed between two reads, implement e.g. an address dependency.

Consider the following variant of MP; the reading thread P1 has an address dependency between its two reads:

This dependency, in conjunction with the lightweight fence on P0, should be enough to forbid the non-SC execution of MP.

Exercise: Distributed. 2+2w Consider the following litmus test, which is essentially a distributed variant of 2+2w:

Which fences should we use to forbid the final state? Why? Where should we put them?

We should put a lightweight fence between the read-write pair on P1, and a lightweight fence between the write-write pair on P2, like so:

This is because the A-cumulativity of the lightweight fence on P1 will impose an ordering between the write of x on P0 and the write of y on P1.

Exercise: Independent Reads of Independent Writes. Consider the following litmus test, known as IRIW:

Which fences should we use to forbid the final state? Why? Where should we put them?

We should put a heavyweight fence between the read-read pairs on P1 and P3, like so:

This is because IRIW essentially is a distributed variant of the store buffering example, therefore reacts to fences in much the same way as SB, just like ISA2 or WRC react to fences (lightweight in their case) in much the same way as MP.

Thus the A-cumulativity of the heavyweight fence will impose an ordering between the write of x on P0 and the read of y on P1 (idem for the write of y on P2 and the read of x on P3).

Exercise: SC and TSO Re-Make Re-Model [9]. Remember that we’ve defined SC and (not quite) TSO earlier. Try to reformulate both models in terms of the five principles we’ve just learnt: sc per location, no thin air, propagation, observation and restoring sc. The SC model you’ll end up with will be equivalent to the one above. The TSO model you’ll end up with will be TSO proper!

Take the tiger cat file, and use the following bell files. For SC:

and for TSO:

1.3 A.3 Second Big Cat: Jaguar

Exercise: Difference Between RMO and Power or ARM What’s a test that distinguishes RMO from Power or ARM (as we’ve defined them previously)? More precisely, what’s a test that’s forbidden on RMO but allowed on Power or ARM?

IRIW+deps distinguishes RMO from Power or ARM:

because it is forbidden on RMO [1, 2], but allowed on Power and ARM [5].

Exercise: Implementing Scope Inclusion. How would you implement an RMO per scope model as above, but where the fences have an effect not only at their scopes, but also at narrower scopes? That is, the fence for system also has an effect at gpu and cta level for example? This procedure should forbid mp-mit-scopes+fgpu+fsys, but not mp-mit-scopes+fcta+fgpu.

We need to invent a recursive notion of wider, that will return, given a scope level s, all scope levels wider than s, not just the immediately wider one:

We can then use this new notion in the way we define our rmo-fences:

Note how we replace the call to tag2events by a call to wider2 in the definition of rmo-fences.

Exercise: Release Sequence. In C++ there’s a notion of release sequence, which says that synchronisation does not just happen via special-rf, i.e. from the special write of a special read-from to the corresponding special read. Rather, it can happen from any write of the same location and on the same thread that precedes (in program order) a special write.

How would you modify our model to include this notion?

Like so (where ? means zero or one step):

B Complete Bell and Cat Files

1.1 B.1 Kittens

Here’s kittens.bell :

Here’s kittens.cat :

1.2 B.2 Tiger

Here’s tiger.bell :

Here’s tiger.cat :

1.3 B.3 Jaguar

Here’s jaguar.bell :

Here’s jaguar.cat :

1.4 B.4 Panther

Here’s panther.bell :

Here’s panther.cat :

Rights and permissions

Reprints and permissions

Copyright information

About this chapter

Cite this chapter

Alglave, J. (2015). Modeling of Architectures. In: Bernardo, M., Johnsen, E. (eds) Formal Methods for Multicore Programming. SFM 2015. Lecture Notes in Computer Science(), vol 9104. Springer, Cham. https://doi.org/10.1007/978-3-319-18941-3_3

Download citation

DOI: https://doi.org/10.1007/978-3-319-18941-3_3
Published: 07 May 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-18940-6
Online ISBN: 978-3-319-18941-3
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics