
Publications

 

My Google Scholar profile is here.

 

Type Year Venue Paper

C.26

2014

HPCA

Ahmed ElTantawy, Jessica Wenjie Ma, Mike O'Connor, Tor M. Aamodt, A Scalable Multi-Path Microarchitecture for Efficient GPU Control Flow, to appear in proceedings of the 20th IEEE International Symposium on High-Performance Computer Architecture (HPCA-20), Orlando, FL, February 15-19, 2014.

C.25

2013

MICRO

Wilson W. L. Fung, Tor M. Aamodt, Energy Efficient GPU Transactional Memory via Space-Time Optimizations, in proceedings of the 46th IEEE/ACM International Symposium on Microarchitecture (MICRO-46), pp. 408-420, Davis, CA, December 7-11, 2013. (acceptance rate: 39/239 ≈ 16.3%), simulator code, slides

C.24

2013

MICRO

Timothy G. Rogers, Mike O'Connor, Tor M. Aamodt, Divergence-Aware Warp Scheduling, in proceedings of the 46th IEEE/ACM International Symposium on Microarchitecture (MICRO-46), pp. 99-110, Davis, CA, December 7-11, 2013. (acceptance rate: 39/239 ≈ 16.3%), slides

C.23

2013

ISCA

Jingwen Leng, Tayler Hetherington, Ahmed ElTantawy, Syed Gilani, Nam Sung Kim, Tor M. Aamodt, Vijay Janapa Reddi, GPUWattch: Enabling Energy Optimizations in GPGPUs, In proceedings of the ACM/IEEE International Symposium on Computer Architecture (ISCA 2013), pp. 487-498, Tel-Aviv, Israel, June 23-27, 2013. (acceptance rate: 56/288 ≈ 19.4%), GPUWattch is included in GPGPU-Sim 3.2.1 onward

C.22

2013

DATE

Vitaly Zakharenko, Tor M. Aamodt, Andreas Moshovos, Characterizing the Performance Benefits of Fused CPU/GPU Systems Using FusionSim, Design, Automation and Test in Europe (DATE), pp. 685-688, Grenoble, France, 18-22 March, 2013. (interactive presentation) FusionSim website

C.21

2013

ASPLOS

Hadi Jooybar, Wilson W. L. Fung, Mike O'Connor, Joseph Devietti, Tor M. Aamodt, GPUDet: A Deterministic GPU Architecture, In proceedings of the Eighteenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013), pp. 1-12, Houston, Texas, March 16-20, 2013. (acceptance rate: 44/191 ≈ 23.0%) slides, simulator code+benchmarks

C.20

2013

HPCA

Inderpreet Singh, Arrvindh Shriraman, Wilson W. L. Fung, Mike O'Connor, Tor M. Aamodt, Cache Coherence for GPU Architectures, In proceedings of the 19th IEEE International Symposium on High-Performance Computer Architecture (HPCA-19), pp. 578-590, Shenzhen, China, February 23-27, 2013. simulator code, benchmarks, slides (acceptance rate: 51/249 ≈ 20.5%) Selected for IEEE Micro Top Picks

C.19

2012

FPT

Jimmy Kwa, Tor M. Aamodt, Small Virtual Channel Routers on FPGAs Through Block RAM Sharing, In proceedings of the IEEE International Conference on Field Programmable Technology (FPT), pp. 71-79, Seoul, Korea, December 10-12, 2012. download RTL (acceptance rate: 24/114 ≈ 21.1%)

C.18

2012

MICRO

Timothy G. Rogers, Mike O'Connor, Tor M. Aamodt, Cache-Conscious Wavefront Scheduling, In proceedings of the 45th IEEE/ACM International Symposium on Microarchitecture (MICRO-45), pp. 72-83, Vancouver, BC, December 1-5, 2012. (acceptance rate: 40/228 ≈ 17.5%) Best paper runner up, Selected for IEEE Micro Top Picks, CACM Research Highlight, simulator core + benchmarks

C.17

2012

ISPASS

Tayler H. Hetherington, Timothy G. Rogers, Lisa Hsu, Mike O'Connor, Tor M. Aamodt, Characterizing and Evaluating a Key-Value Store Application on Heterogeneous CPU-GPU Systems, In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 88-98, New Brunswick, NJ, April 1-3, 2012. download code, slides. (acceptance rate: 20/65 ≈ 30.8%)

C.16

2011

MICRO

Wilson W. L. Fung, Inderpreet Singh, Andrew Brownsword, Tor M. Aamodt, Hardware Transactional Memory for GPU Architectures, In proceedings of the 44th IEEE/ACM International Symposium on Microarchitecture (MICRO-44), pp. 296-307, Porto Alegre, Brazil, December 3-7, 2011. slides, longer talk, simulator as used in MICRO 2011 paper, simulator with recent changes to GPGPU-Sim 3.x, benchmarks (acceptance rate: 44/209 ≈ 21.1%) Selected for IEEE Micro Top Picks

C.15

2011

HPCA

Wilson W. L. Fung, Tor M. Aamodt, Thread Block Compaction for Efficient SIMT Control Flow, In proceedings of the 17th IEEE International Symposium on High-Performance Computer Architecture (HPCA-17), pp. 25-36, San Antonio, Texas, February 12-16, 2011. pre-print, slides, simulator code (acceptance rate: 42/227 ≈ 18.5%)

C.14

2010

MICRO

Ali Bakhoda, John Kim, Tor M. Aamodt, Throughput-Effective On-Chip Networks for Manycore Accelerators, In proceedings of the 43rd IEEE/ACM International Symposium on Microarchitecture (MICRO-43), pp. 421-432, Atlanta, Georgia, December 4-8, 2010. pre-print, BibTeX (acceptance rate: 45/248 ≈ 18.1%)

C.13

2010

PACT

Ali Bakhoda, John Kim, Tor M. Aamodt, On-Chip Network Design Considerations for Compute Accelerators, In Nineteenth International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 535-536, Vienna, Austria, September 11-15, 2010. pre-print, BibTeX Best poster award, 2nd place

C.12

2010

ISPASS

Aaron Ariel, Wilson W. L. Fung, Andrew Turner, Tor M. Aamodt, Visualizing Complex Dynamics in Many-Core Accelerator Architectures, In Proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 164-174, White Plains, NY, March 28-30, 2010. pre-print, BibTeX (acceptance rate: 22/64 ≈ 34.4%)

C.11

2010

ISQED

Johnny Kuan, Steve J. E. Wilton, Tor M. Aamodt, Accelerating Trace Computation in Post-Silicon Debug, In Proceedings of the 11th IEEE International Symposium on Quality Electronic Design (ISQED 2010), pp. 244-249, San Jose, CA, March 22-24, 2010. pre-print, BibTeX (poster presentation)

C.10

2009

MICRO

George L. Yuan, Ali Bakhoda, Tor M. Aamodt, Complexity Effective Memory Access Scheduling for Many-Core Accelerator Architectures, In proceedings of the 42nd IEEE/ACM International Symposium on Microarchitecture (MICRO-42), pp. 34-44, New York, NY, December 12-16, 2009. slides, pre-print, BibTeX (acceptance rate: 52/209 ≈ 24.9%)

C.9

2009

ISPASS

Ali Bakhoda, George L. Yuan, Wilson W. L. Fung, Henry Wong, Tor M. Aamodt, Analyzing CUDA Workloads Using a Detailed GPU Simulator, In proceedings of the IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS), pp. 163-174, Boston, MA, April 26-28, 2009. slides, pre-print, simulator, BibTeX (acceptance rate: 24/86 ≈ 27.9%)

C.8

2009

HPCA

Xi E. Chen and Tor M. Aamodt, A First-Order Fine-Grained Multithreaded Throughput Model, In proceedings of the 15th IEEE International Symposium on High-Performance Computer Architecture (HPCA-15), pp. 329-340, Raleigh, North Carolina, February 14-18, 2009. hpca pre-print, BibTeX (acceptance rate: 35/184 ≈ 19.0%) -- journal version

C.7

2008

MICRO

Xi E. Chen and Tor M. Aamodt, Hybrid Analytical Modeling of Pending Cache Hits, Data Prefetching, and MSHRs, In proceedings of the 41st IEEE/ACM International Symposium on Microarchitecture (MICRO-41), pp. 59-70, Lake Como, Italy, November 8-12, 2008. pre-print, BibTeX (acceptance rate: 40/210 ≈ 19.0%)

C.6

2008

PACT

Henry Wong, Anne Bracy, Ethan Schuchman, Tor M. Aamodt, Jamison D. Collins, Perry H. Wang, Gautham Chinya, Ankur Khandelwal Groen, Hong Jiang, and Hong Wang, Pangaea: A Tightly-Coupled IA32 Heterogeneous Chip Multiprocessor, In proceedings of the 17th IEEE/ACM International Conference on Parallel Architectures and Compilation Techniques (PACT), pp. 52-61, Toronto, ON, October 25-29, 2008. pre-print, BibTeX (acceptance rate: 30/159 ≈ 18.9%)

C.5

2007

MICRO

Wilson W. L. Fung, Ivan Sham, George Yuan, and Tor M. Aamodt, Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow, In proceedings of the 40th IEEE/ACM International Symposium on Microarchitecture (MICRO-40), pp. 407-418, Chicago, IL, December 1-5, 2007. slides, pre-print, BibTeX (acceptance rate: 35/166 ≈ 21.1%)

C.4

2007

ICS

Tor M. Aamodt and Paul Chow, Optimization of Data Prefetch Helper Threads with Path-Expression Based Statistical Modeling, In proceedings of the 21st ACM International Conference on Supercomputing (ICS), pp. 210-221, Seattle, WA, June 16-20, 2007. BibTeX (acceptance rate: 29/123 ≈ 23.6%)

C.3

2004

HPCA

Tor M. Aamodt, Paul Chow, Per Hammarlund, Hong Wang, and John P. Shen, Hardware Support for Prescient Instruction Prefetch, In proceedings of the 10th IEEE International Symposium on High Performance Computer Architecture (HPCA-10), pp. 84-95, Madrid, Spain, February 14-18, 2004. (acceptance rate: 27/153 ≈ 17.6%) BibTeX

C.2

2003

METRICS

Tor M. Aamodt, Pedro Marcuello, Paul Chow, Antonio Gonzalez, Per Hammarlund, Hong Wang, and John P. Shen, A Framework for Modeling and Optimization of Prescient Instruction Prefetch, In proceedings of the ACM International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS 2003), pp. 13-24, San Diego, CA, June 10-14, 2003. slides, BibTeX (acceptance rate: 26/222 ≈ 11.7%)

C.1

2000

CASES

Tor Aamodt and Paul Chow, Embedded ISA Support for Enhanced Floating-Point to Fixed-Point ANSI C Compilation, In proceedings of the 3rd ACM International Conference on Compilers, Architectures and Synthesis for Embedded Systems (CASES-2000), pp. 128-137, San Jose, CA, November 17-18, 2000. slides. (acceptance rate: 25/56 ≈ 44.6%)

J.11

In Press

CAL

Milad Mohammadi, Song Han, Tor M. Aamodt, William J. Dally, On-Demand Dynamic Branch Prediction, to appear in IEEE Computer Architecture Letters.

J.10

In Press

CACM

Timothy G. Rogers, Mike O'Connor, Tor M. Aamodt, Learning Your Limit: Managing Massively Multithreaded Caches Through Scheduling, to appear in Communications of the ACM.

J.9

2014

Top Picks

Inderpreet Singh, Arrvindh Shriraman, Wilson W. L. Fung, Mike O'Connor, Tor M. Aamodt, Cache Coherence for GPU Architectures, IEEE Micro, Special Issue: Micro's Top Picks from 2013 Computer Architecture Conferences, May/June 2014.

J.8

2013

TACO

Ali Bakhoda, John Kim, Tor M. Aamodt, Designing On-Chip Networks for Throughput Accelerators, ACM Transactions on Architecture and Code Optimization (TACO), Vol. 10, No. 3, Article 21, September 2013.

J.7

2013

Top Picks

Timothy G. Rogers, Mike O'Connor, Tor M. Aamodt, Cache-Conscious Thread Scheduling for Massively Multithreaded Processors, IEEE Micro, Special Issue: Micro's Top Picks from 2012 Computer Architecture Conferences, Vol. 33, No. 3, pp. 78-85, May/June 2013.

J.6

2012

TVLSI

Marcel Gort, Flavio M. De Paula, Johnny J.W. Kuan, Tor M. Aamodt, Alan J. Hu, Steven J.E. Wilton, Jin Yang, Formal-Analysis-Based Trace Computation for Post-Silicon Debug, IEEE Transactions on Very Large Scale Integration Systems, Vol. 20, No. 11, pp. 1997-2010, November 2012.

J.5

2012

TC

Xi E. Chen and Tor M. Aamodt, Modeling Cache Contention and Throughput of Multiprogrammed Manycore Processors, IEEE Transactions on Computers, Vol. 61, No. 7, pp. 913-927, July 2012.

J.4

2012

Top Picks

Wilson W. L. Fung, Inderpreet Singh, Andrew Brownsword, Tor M. Aamodt, Kilo TM: Hardware Transactional Memory for GPU Architectures, IEEE Micro, Special Issue: Micro's Top Picks from 2011 Computer Architecture Conferences, Vol. 32, No. 3, pp. 7-16, May/June 2012.

J.3

2011

TACO

Xi E. Chen and Tor M. Aamodt, Hybrid Analytical Modeling of Pending Cache Hits, Data Prefetching, and MSHRs, ACM Transactions on Architecture and Code Optimization (TACO), Vol. 8, No. 3, Article 10 (October 2011), 28 pages.

J.2

2009

TACO

Wilson W. L. Fung, Ivan Sham, George Yuan, and Tor M. Aamodt, Dynamic Warp Formation: Efficient MIMD Control Flow on SIMD Graphics Hardware, ACM Transactions on Architecture and Code Optimization (TACO), Vol. 6, No. 2, Article 7 (June 2009), 37 pages. BibTeX

J.1

2008

TECS

Tor M. Aamodt, Paul Chow, Compile-Time and Instruction Set Methods for Improving Floating- to Fixed-Point Conversion Accuracy, ACM Transactions on Embedded Computing Systems (TECS), Vol. 7, No. 3, Article 26 (April 2008), 27 pages.

I.1

2009

PACRIM

Tor M. Aamodt, Architecting Graphics Processors for Non-Graphics Compute Acceleration, In proceedings of the 2009 IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, Special Session on Computer Architecture (PACRIM-09), Victoria, BC, August 23-26, 2009. (invited paper)

P.5

2010

Tor M. Aamodt, Hong Wang, Per Hammarlund, John P. Shen, Steve Shih-wei Liao, Perry H. Wang, Method and apparatus for efficient resource utilization for prescient instruction prefetch, United States Patent #7,818,547, Issued October 19, 2010. Assignee: Intel Corporation.

P.4

2010

Hong Wang, Tor M. Aamodt, Pedro Marcuello, Jared W. Stark, John P. Shen, Antonio Gonzalez, Per Hammarlund, Gerolf F. Hoflehner, Perry H. Wang, Steve Shih-wei Liao, Speculative multi-threading for instruction prefetch and/or trace pre-build, United States Patent #7,814,469, Issued October 12, 2010. Assignee: Intel Corporation.

P.3

2010

Hong Wang, Tor Aamodt, Per Hammarlund, John P. Shen, Xinmin Tian, Milind Girkar, Perry Wang, Steve Shih-wei Liao, Safe store for speculative helper threads, United States Patent #7,657,880, Issued February 2, 2010. Assignee: Intel Corporation.

P.2

2009

Tor M. Aamodt, Hong Wang, John P. Shen, Per Hammarlund, Methods and apparatus for generating speculative helper thread spawn-target points, United States Patent #7,523,465, Issued April 21, 2009. Assignee: Intel Corporation.

P.1

2008

Tor M. Aamodt, Hong Wang, Per Hammarlund, John P. Shen, Steve Shih-wei Liao, Perry H. Wang, Method and apparatus for efficient utilization for prescient instruction prefetch, United States Patent #7,404,067, Issued July 22, 2008. Assignee: Intel Corporation.

W.8

2011

MTV

Johnny J.W. Kuan, Tor M. Aamodt, Progressive-BackSpace: Efficient Predecessor Computation for Post-Silicon Debug, In proceedings of the 12th IEEE International Workshop on Microprocessor Test and Verification (MTV 2011), Austin, TX, December 5-7, 2011.

W.7

2009

WDDD

Henry Wong and Tor M. Aamodt, The Performance Potential for Single Application Heterogeneous Systems, 8th Annual Workshop on Duplicating, Deconstructing, and Debunking (WDDD 2009), (in conjunction with ISCA 2009), Austin, Texas, June 21, 2009. slides

W.6

2009

MoBS

George L. Yuan and Tor M. Aamodt, A Hybrid Analytical DRAM Performance Model, 5th Workshop on Modeling, Benchmarking and Simulation (MoBS 2009), (in conjunction with ISCA 2009), Austin, Texas, June 21, 2009.

W.5

2008

MoBS

Xi E. Chen and Tor M. Aamodt, An Improved Analytical Superscalar Microprocessor Memory Model, 4th Workshop on Modeling, Benchmarking and Simulation (MoBS 2008), (in conjunction with ISCA 2008), pp. 7-16, Beijing, China, June 22, 2008.

W.4

2008

CMP-MSI

Ali Bakhoda and Tor M. Aamodt, Extending the Scalability of Single Chip Stream Processors with On-chip Caches, 2nd Workshop on Chip Multiprocessor Memory Systems and Interconnects (CMP-MSI 2008), (in conjunction with ISCA 2008), 9 pages, Beijing, China, June 22, 2008.

W.3

2002

MTEAC

Tor Aamodt, Pedro Marcuello, Paul Chow, Per Hammarlund, and Hong Wang, Prescient Instruction Prefetch, MTEAC-6 (in conjunction with MICRO-35), pp. 3-10, Istanbul, Turkey, November 2002. Best student paper award

W.2

2001

MTEAC

Tor Aamodt, Andreas Moshovos, and Paul Chow, The Predictability of Computations that Produce Unpredictable Outcomes, MTEAC-5 (in conjunction with MICRO-34), pp. 23-34, Austin, Texas, December 2001. slides

W.1

1999

MPDSP

Tor Aamodt and Paul Chow, Numerical Error Minimizing Floating-Point to Fixed-Point ANSI C Compilation, MPDSP-1 (in conjunction with MICRO-32), pp. 3-12, Haifa, Israel, November 1999. slides

TR.3

2012

Wilson W. L. Fung, Inderpreet Singh, and Tor M. Aamodt, Kilo TM Correctness: ABA Tolerance and Validation-Commit Indivisibility, Technical Report, University of British Columbia, 24 May 2012.

TR.2

2007

Owen Kirby, Shahriar Mirabbasi, and Tor M. Aamodt, Mixed-Signal Neural Network Branch Prediction, Technical Report, University of British Columbia, 8 June 2007.

TR.1

2001

Tor Aamodt, Andreas Moshovos, and Paul Chow, The Predictability of Computations that Produce Unpredictable Outcomes, Technical Report #TR-01-08-01, EECG, University of Toronto, August 2001.

T.3

2006

Tor M. Aamodt, Modeling and Optimization of Speculative Threads, Doctoral Thesis, University of Toronto, 2006.

T.2

2001

Tor Aamodt, Floating-Point to Fixed-Point Compilation and Embedded Architectural Support, Masters Thesis, University of Toronto, January 2001.

T.1

1997

Tor Aamodt, Intelligent Control via Reinforcement Learning: State Transfer and Stabilization of a Rotational Inverted Pendulum, Bachelors Thesis, University of Toronto, April 1997.

Talks

 

2013

Efficient and Easily Programmable Accelerator Architectures PDF (Stanford PPL Retreat)

2012

Evolving GPUs into a Substrate for Cloud Computing (Microsoft Research)

2011

Hardware Transactional Memory for GPU Architectures (AMD, Rambus, Intel, NVIDIA)

2011

GPU Architecture Challenges for Throughput Computing (UVic, USask, EPFL, UofT, Qualcomm)

2008

Leveraging Fine-Grained Multithreading for Efficient SIMD Control Flow (Microsoft Research)


Department of Electrical and Computer Engineering

tel 604.827.4116 | e-mail aamodt@ece.ubc.ca
