  • Our paper will be presented at MLCAD 2020

    Our paper "Transfer-Learning for Design-Space Exploration with High-Level Synthesis" will be presented at the virtual ACM/IEEE Workshop on Machine Learning for CAD (MLCAD 2020) on November 20th. This work has been selected as a best paper award nominee.

    High-level synthesis (HLS) raises the level of design abstraction, expedites the process of hardware design, and enriches the set of final designs by automatically translating a behavioral specification into a hardware implementation. To obtain different implementations, HLS users can apply a variety of knobs, such as loop unrolling or function inlining, to particular code regions of the specification. The applied knob configuration significantly affects the synthesized design's performance and cost, e.g., application latency and area utilization. Hence, HLS users face the design-space exploration (DSE) problem, i.e., determining which knob configurations result in Pareto-optimal implementations in this multi-objective space. Because running HLS flows with an enormous number of knob configurations can be costly in time and resources, machine learning approaches can be employed to predict the performance and cost. Still, they require a sufficient number of sample HLS runs. To enhance the training performance and reduce the sample complexity, we propose a transfer learning approach that reuses the knowledge obtained from previously explored design spaces in exploring a new target design space. We develop a novel neural network model for mixed-sharing multi-domain transfer learning. Experimental results demonstrate that the proposed model outperforms both single-domain and hard-sharing models in predicting the performance and cost at early stages of HLS-driven DSE.
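
    As a rough illustration of what "Pareto-optimal" means in the DSE problem described above, the following minimal sketch filters a set of already-synthesized designs down to the non-dominated ones. It is not the model from the paper: the `Design` type, the knob labels, and all numbers are made up for the example, and both objectives are assumed to be minimized.

```python
from typing import NamedTuple


class Design(NamedTuple):
    """One synthesized implementation; all fields are illustrative."""
    knobs: str      # hypothetical label for the applied knob configuration
    latency: float  # performance objective, lower is better
    area: float     # cost objective, lower is better


def dominates(a: Design, b: Design) -> bool:
    """True if a is no worse than b in both objectives and better in one."""
    return (a.latency <= b.latency and a.area <= b.area
            and (a.latency < b.latency or a.area < b.area))


def pareto_front(designs: list[Design]) -> list[Design]:
    """Keep only the designs that no other design dominates."""
    return [d for d in designs
            if not any(dominates(other, d) for other in designs)]


# Made-up sample points, for illustration only.
samples = [
    Design("unroll=1", latency=1000, area=50),
    Design("unroll=2", latency=600, area=80),
    Design("unroll=4", latency=400, area=150),
    Design("unroll=4,inline", latency=450, area=160),  # dominated by unroll=4
]
front = pareto_front(samples)
print([d.knobs for d in front])
```

    In a learning-based flow, the latency and area values for most configurations would come from a predictor rather than from actual HLS runs, which is where the sample complexity mentioned above matters.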

  • Overview paper on ESP to be presented at ICCAD 2020

    The new overview paper on ESP "Agile SoC Development with Open ESP" will be presented at ICCAD 2020 on November 3rd.

    Check out the ICCAD registration page, the pitch talk and the paper preprint. The event will take place on the Whova virtual platform; you can access the live stream and video recordings here.

  • Upcoming tutorial on ESP at MICRO 2020

    We will present the tutorial "ESP: an Open-Source Platform for Agile SoC Development" at MICRO 2020.

    More info on the tutorial page on the ESP website. The tutorial material will be available there soon.

  • Our CRYLOGGER paper has been accepted at S&P2021

    Our paper "CRYLOGGER: Detecting Crypto Misuses Dynamically" has been accepted for publication in the proceedings of the IEEE Symposium on Security and Privacy.

    The paper describes CRYLOGGER, the first open-source tool that detects cryptographic (crypto) misuses in Android and Java applications. A crypto misuse is an invocation of a crypto API that does not respect common security guidelines, such as those suggested by cryptographers or organizations like NIST and IETF. To detect misuses, CRYLOGGER logs the parameters that are passed to the crypto APIs during execution and checks their legitimacy offline by using a list of crypto rules. Unlike other approaches, it works dynamically and therefore does not require analyzing the code of the applications. We analyzed 1780 popular Android apps downloaded from the Google Play Store and showed that CRYLOGGER can detect crypto misuses on thousands of apps dynamically and automatically.
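
    The log-then-check idea described above can be sketched in a few lines. This is only an illustration under stated assumptions: the log-entry fields, the three rules, and the iteration threshold are hypothetical and are not CRYLOGGER's actual log format or rule list.

```python
from typing import Callable

# Hypothetical log format: one dict per crypto API call recorded at runtime.
LogEntry = dict[str, object]

# Hypothetical rules: each returns True when the entry respects the guideline.
RULES: dict[str, Callable[[LogEntry], bool]] = {
    "no broken hash functions": lambda e: not (
        e.get("api") == "MessageDigest"
        and e.get("algorithm") in {"MD5", "SHA-1"}
    ),
    "no ECB mode for ciphers": lambda e: not (
        e.get("api") == "Cipher" and e.get("mode") == "ECB"
    ),
    "at least 1000 PBE iterations": lambda e: not (
        e.get("api") == "PBEKeySpec" and int(e.get("iterations", 0)) < 1000
    ),
}


def check_log(entries: list[LogEntry]) -> list[tuple[int, str]]:
    """Offline check: return (entry index, violated rule name) pairs."""
    violations = []
    for index, entry in enumerate(entries):
        for name, respects_rule in RULES.items():
            if not respects_rule(entry):
                violations.append((index, name))
    return violations


# Made-up log, standing in for parameters collected during execution.
log = [
    {"api": "MessageDigest", "algorithm": "SHA-256"},
    {"api": "Cipher", "algorithm": "AES", "mode": "ECB"},
    {"api": "MessageDigest", "algorithm": "MD5"},
    {"api": "PBEKeySpec", "iterations": 100},
]
violations = check_log(log)
for index, rule in violations:
    print(f"entry {index}: violates '{rule}'")
```

    Because the check runs on logged parameters only, it needs no access to the application's source or bytecode, which is the property the paragraph above highlights.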

    To find out more about CRYLOGGER, read our paper or check out the code on GitHub.

  • NVDLA+Ariane with ESP at CARRV 2020

    Our paper "Ariane+NVDLA: Seamless Third-Party IP Integration with ESP" will appear at CARRV.

    CARRV 2020 is the Fourth Workshop on Computer Architecture Research with RISC-V (co-located with ISCA) and it will take place virtually on Friday May 29th. The program of the workshop is available here and you can register for ISCA here.

    Paper, slides and video of the presentation are already available. The hands-on tutorial on this work will be available soon.

  • MosaicSim nominated for best paper award at ISPASS 2020

    We will present MosaicSim at ISPASS 2020. MosaicSim is a lightweight, modular heterogeneous system simulator for early-stage explorations of SoCs and HW-SW co-design leveraging LLVM IR and the ESP accelerator design flow.

    Check out the MosaicSim paper and open-source tool. This is a product of our collaboration with Princeton within the frame of the DECADES project.

  • Our HL5 invited paper will be presented at CICC'20

    Our invited paper "HL5: A 32-bit RISC-V Processor Designed with High-Level Synthesis" will be presented at CICC 2020 on March 23rd in Boston (MA).

  • Video proceedings of the ESP talk at the RISC-V Summit now available

    The video of the ESP talk at the RISC-V Summit 2019 is now available on YouTube.

  • Two accepted talks on ESP at FOSDEM 2020 in Brussels

    We will give two talks about ESP in the RISC-V developer room at FOSDEM in Brussels (Belgium) on February 1st.

    Follow the links above for more details and join us in Brussels!

  • Our ESP4ML paper has been accepted at DATE 2020

    Our paper "ESP4ML: Platform-Based Design of Systems-on-Chip for Embedded Machine Learning" has been accepted at DATE 2020.

    We will present it on March 11th in Grenoble (France).

  • Accepted tutorial on ESP at ASPLOS 2020

    We will present the tutorial "ESP: An Open-Source Platform for Interdisciplinary Research on SoC Design and
    at ASPLOS 2020 on March 17th in Lausanne (Switzerland).

    The tutorial material is available on the ESP website.

  • Upcoming talk on ESP at VLSI & ES 2020 in Bangalore

    Join us in Bangalore on January 5th for a talk about ESP at the International Conference on VLSI Design and International Conference on Embedded Design (VLSID & ES). The talk is part of the tutorial session "Invisible Computing: Embedded Systems". Here is an introductory video on Twitter.

  • KAIROS: Incremental Verification in High-Level Synthesis through Latency-Insensitive Design

    Our paper "KAIROS: Incremental Verification in High-Level Synthesis through Latency-Insensitive Design" has been accepted for publication in the proceedings of ACM Formal Methods in Computer-Aided Design. The paper is available here. The slides are available here.

    The paper describes KAIROS, an automatic methodology for incremental formal verification in high-level synthesis (HLS). KAIROS verifies the equivalence of the RTL implementations the designer subsequently derives from the same HLS-ready specification by applying code manipulations and knobs. We evaluate KAIROS by checking the equivalence of multiple RTL implementations of a hardware module and a RISC-V processor designed with HLS: KAIROS quickly detects bugs caused by wrong code manipulations and knob applications.

  • Upcoming tutorial on ESP at the ESWeek in New York

    Join us in New York on October 13th at the Embedded Systems Week for the first-ever ESP tutorial!

    Here are the conference website and the tutorial details.

  • ESP Open-Source Release

    We announce the release of ESP, our RISC-V based platform for the seamless design of heterogeneous SoCs.

    ESP is an open-source research platform for the design of heterogeneous systems-on-chip. The platform combines an architecture and a methodology. The flexible tile-based architecture simplifies the integration of heterogeneous components by balancing regularity and specialization. The companion methodology raises the level of abstraction to system-level design, thus promoting closer collaboration among software programmers and hardware engineers.

  • Two PhD students join the SLD Group

    The SLD Group welcomes two new PhD students.

    Maico Cassel Santos received his B.S. in Electrical Engineering and his M.S. in Microelectronics from the Universidade Federal do Rio Grande do Sul in Brazil. He has multiple years of working experience as a Digital IC Designer at Ceitec-SA and NSCAD in Brazil.

    Joseph Zuckerman received his B.S. in Electrical Engineering from Harvard University with a focus on hardware architectures for machine learning applications.

  • Our paper will be presented at DAC'19

    Our paper "A Learning-Based Recommender System for Autotuning Design Flows of Industrial High-Performance Processors" will be presented at Design Automation Conference (DAC) 2019. The slides of the presentation are now available.

    Logic synthesis and physical design (LSPD) tools automate complex design tasks previously performed by human designers. One time-consuming task that remains manual is configuring the LSPD flow parameters, which significantly impacts design results. To reduce the parameter-tuning effort, we propose an LSPD parameter recommender system that involves learning a collaborative prediction model through tensor decomposition and regression. Using a model trained with archived data from multiple state-of-the-art 14nm processors, we reduce the exploration cost while achieving comparable design quality. Furthermore, we demonstrate the transfer-learning properties of our approach by showing that this model can be successfully applied to 7nm designs.

  • Congratulations to Emilio for defending his thesis!

    Today Emilio successfully defended his thesis, titled "Scalable Emulation of Heterogeneous Systems". Congratulations, Dr. Cota!

  • Our paper on instrumenting QEMU to appear at VEE'19

    Our paper "Cross-ISA Machine Instrumentation Using Fast and Scalable Dynamic Binary Translation" has been accepted for publication at the upcoming VEE'19 conference to be held in Providence, RI. The paper presents Qelt, a cross-ISA emulator and instrumentation tool based on QEMU.

    Qelt implements three contributions:

    1. Fast cross-ISA floating point (FP) emulation by leveraging the host FP unit for most FP operations.
    2. A parallel, memory-efficient dynamic binary translation (DBT) engine that scales for multi-core guests that generate translated code in parallel.
    3. An ISA-agnostic instrumentation layer that converts a cross-ISA DBT engine into a low-overhead cross-ISA instrumentation tool.

    In addition, Qelt incorporates other state-of-the-art DBT techniques (e.g. indirect branch handling improvements, and dynamic TLB sizing in full-system mode) that further speed up emulation.

    Our results show that Qelt scales to 32 cores when emulating a guest machine used for parallel compilation, which demonstrates scalable code translation. Furthermore, experiments based on SPEC06 show that Qelt (1) outperforms QEMU as a full-system cross-ISA machine emulator by 1.76x/2.18x for integer/FP workloads, (2) outperforms state-of-the-art, cross-ISA, full-system instrumentation tools by 1.5x-3x, and (3) can match the performance of Pin, a state-of-the-art, same-ISA instrumentation tool, when used for complex instrumentation such as cache simulation.

    All of the features implemented in Qelt, except the instrumentation layer, have already been merged into upstream QEMU: indirect branch improvements are in QEMU v2.10, parallel translation is in v3.0, and our FP work as well as dynamic TLB sizing will be in v4.0, scheduled for release in April 2019. The code used in our evaluation can be found in this branch.

    We sincerely thank Richard Henderson, Alex Bennée and the rest of the QEMU community for their dependable guidance and extensive improvements to our work.

  • Our invited paper presented at ASPDAC'19

    Our paper "Runtime Reconfigurable Memory Hierarchy in Embedded Scalable Platforms" has been presented in Tokyo at ASPDAC 2019. Check out the slides of the presentation. This work is part of the DECADES project.

    In heterogeneous systems-on-chip, the optimal choice of the cache-coherence model for a loosely-coupled accelerator may vary at each invocation, depending on workload and system status. We propose a runtime adaptive algorithm to manage the coherence of accelerators. The algorithm’s choices are based on the combination of static and dynamic features of the active accelerators and their workloads. We evaluate the algorithm by leveraging our FPGA-based platform for rapid SoC prototyping. Experimental results, obtained through the deployment of a multi-core and multi-accelerator system that runs Linux SMP, show the benefits of our approach in terms of execution time and memory accesses.

  • Our paper in IEEE Micro's special issue on hardware acceleration

    Our paper "Accelerators and Coherence: An SoC Perspective" has been published in IEEE Micro's special issue on hardware acceleration. This work is part of the DECADES project.

    The complexity of System-on-Chip designs continues to grow as each SoC features an increasing variety of loosely-coupled accelerators together with multiple processor cores. Specialized-hardware accelerators are typically designed in isolation, optimized for the algorithm they are implementing, and with limited consideration of the implications of their integration into a given SoC. However, the interaction between these accelerators and the memory hierarchy is critically important for their performance and the performance of the overall SoC. By leveraging our platform for rapid SoC prototyping, we analyze three models of coherence for loosely-coupled accelerators from a system-level perspective.

  • Our paper on cache coherence for accelerators presented at NOCS'18

    Our paper "NoC-Based Support of Heterogeneous Cache-Coherence Models for Accelerators" has been presented at the NOCS'18 Symposium, during the Embedded Systems Week. The paper and the slides are now available.

    The paper shows that choosing the appropriate coherence model for accelerators can improve performance and decrease energy consumption. Moreover, the best coherence choice varies at runtime for each accelerator's invocation. For these reasons, we propose a cache-coherence protocol and a NoC-based architecture that support three models of coherence for accelerators, the runtime selection of the coherence model, and the coexistence of heterogeneous coherence models for accelerators.

  • Our paper on supporting DIFT on RISC-V processors

    Our paper "Design and Implementation of a Dynamic Information Flow Tracking Architecture to Secure a RISC-V Core for IoT Applications" has been accepted for publication at the HPEC'18 Conference. The paper is available here.

    The paper describes the design and implementation of dynamic information flow tracking (DIFT) on a RISC-V processor core. DIFT is a security technique that tracks malicious data and control flows and guarantees that they are not exploited by attackers. Our approach supports robust and software-programmable policies that protect bare-metal applications against memory corruption attacks. It has a small impact on application performance, while providing fine-grained management of the tags required by DIFT. We implemented our approach on PULPino, an open-source platform for IoT applications.
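
    For readers new to DIFT, the general idea of tagging data, propagating tags, and checking them before sensitive uses can be sketched in software. This is a generic illustration, not the hardware architecture from the paper: the one-bit tag, the OR propagation policy, the check on jump targets, and the names `Tagged`, `jump_to` and `TaintViolation` are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagged:
    """A value paired with a one-bit tag (hypothetical software model)."""
    value: int
    tainted: bool = False

    def __add__(self, other: "Tagged") -> "Tagged":
        # Propagation policy: the result is tainted if either operand is.
        return Tagged(self.value + other.value,
                      self.tainted or other.tainted)


class TaintViolation(Exception):
    """Raised when tainted data reaches a protected use."""


def jump_to(target: Tagged) -> int:
    # Check policy: never use tainted data as a control-flow target.
    if target.tainted:
        raise TaintViolation(f"tainted jump target {target.value:#x}")
    return target.value


base = Tagged(0x1000)                    # trusted constant from the program
user_input = Tagged(0x40, tainted=True)  # untrusted data, e.g. from a network

print(hex(jump_to(base + Tagged(4))))    # trusted target, allowed
try:
    jump_to(base + user_input)           # tag propagated through the add
except TaintViolation as err:
    print("blocked:", err)
```

    A programmable DIFT design would let software choose which operations propagate tags and which uses are checked, rather than fixing them as this sketch does.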

  • The SLD Group welcomes two new PhD students

    As of this Fall, the SLD group counts two new members.

    Kuan-Lin Chiu received his M.S. in Electrical and Computer Engineering from UCLA and his B.S. in Engineering Science from the National Taiwan University.

    Guy Eichler received his B.A.Sc. in Electrical, Electronics and Communication Engineering from Technion, Israel Institute of Technology. In recent years he has worked as a Design Verification Engineer at IBM.

  • COSMOS: a compositional design-space exploration methodology for hardware accelerators

    Our paper "COSMOS: Coordination of High-Level Synthesis and Memory Optimization for Hardware Accelerators" has been published in ACM Transactions on Embedded Computing Systems (TECS). The paper is available here. The paper has been presented at the ACM/IEEE International Conference on Hardware/Software Codesign and System Synthesis (CODES+ISSS). The slides of the presentation are available here.

    The paper describes COSMOS, an automatic methodology for the design-space exploration of complex hardware accelerators. COSMOS coordinates both high-level synthesis and memory optimization tools in a compositional way. Thanks to the co-design of datapath and memory, COSMOS produces a large set of Pareto-optimal implementations for each component of a given accelerator. Additionally, COSMOS leverages compositional design techniques to quickly converge to the desired trade-off point between cost and performance for the entire accelerator architecture.

  • In the News: Prof. Carloni interviewed on the coming engineering renaissance in chip design

    Luca Carloni gave an interview on the future of chip design and how his own research and teaching hope to contribute to the coming renaissance. Access the full interview here.

  • In the News: Our ongoing work on Brain-Computer Interfaces

    Our group is a member of a research team led by Prof. Ken Shepard working on Brain-Computer Interfaces. This work, funded by DARPA under the NESD program, was recently highlighted on Columbia Engineering's website. Access the news story here.

  • Congratulations to Paolo for defending his thesis!

    Today Paolo successfully defended his thesis, titled "Scalable System-on-Chip Design". Congratulations, Dr. Mantovani!

  • Congratulations to Young Jin for defending his thesis!

    Today Young Jin successfully defended his thesis, titled "Design and Optimization of Networks-on-Chip for Future Heterogeneous Systems-on-Chip". Congratulations, Dr. Yoon!

  • Prof. Carloni named IEEE Fellow

    Luca Carloni has been named Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for "contributions to system-on-chip design automation and latency-insensitive design".

    An extraordinary record of accomplishments is required to reach the highest grade of IEEE membership. More information on Prof. Carloni's nomination and many professional achievements can be found in an article that made the headlines of the Computer Science webpage.

  • Our paper on parallelizing QEMU to appear at CGO'17

    Our paper "Cross-ISA Machine Emulation for Multicores" has been accepted for publication at the upcoming CGO'17 conference to be held in Austin, TX. The work described in the paper enables parallel execution of parallel cross-ISA workloads in QEMU. That is, multi-core hosts can be exploited to speed up the emulation of guests that are (1) multi-core systems or (2) multi-threaded user-mode programs.

    Code changes resulting from this work have started making their way to upstream QEMU. In particular, QEMU v2.7 includes (1) our improved hashing for the TB block hash table and (2) the implementation and use of QHT for scalable TB lookups (commit, commit); see the merge commit. Moreover, the following code has been merged for inclusion in the upcoming QEMU v2.8: (1) the use of the host's atomic instructions to emulate the guest's atomics (x86, arm, aarch64; merge commit), and (2) the necessary work to safely support multi-threaded execution (commit, commit; merge commit).

    This work would not have been possible without the QEMU community. In particular, Paolo Bonzini and Alex Bennée have made key contributions and are coauthors of the paper. Other QEMU developers such as Sergey Fedorov and Richard Henderson have been actively involved by writing code as well as making significant improvements to our ideas and code. We are very grateful for their help and their patience with us.

  • Two SLD papers presented at DAC'16

    Two papers coauthored by SLD members have been presented in June at DAC 2016 in Austin, TX.

    Paolo Mantovani, Emilio Cota, Christian Pilato and Giuseppe Di Guglielmo authored a paper titled "An FPGA-Based Infrastructure for Fine-Grained DVFS Analysis in High-Performance Embedded Systems".

    Prof. Carloni presented an invited paper titled "The Case for Embedded Scalable Platforms".

    More information is available in this article on the Computer Science website.

  • Prof. Carloni's view on data center accelerators makes the Tech Design Forum

    An article on the Tech Design Forum, "Minimize memory moves for greener data centers", reports Prof. Carloni's recommendations on memory systems and data movement for data center hardware accelerators. This article reviews current trends in memory management for heterogeneous architectures, with a main focus on work from DAC 2016, where our group presented two papers.

  • Prof. Carloni guest editor of a Proceedings of the IEEE special issue on Electronic Design Automation

    Proceedings of the IEEE has released a special issue titled "Design Automation of Electronic Systems" on the evolution and future of Electronic Design Automation (EDA). Luca Carloni is one of four "world-leading guest editors", as mentioned in this article on the prweb news center.

  • Two SLD papers accepted to DAC'15

    Two papers coauthored by SLD members have been accepted to DAC'15.

    YoungHoon authored a paper titled "ΣVP: Host-GPU Multiplexing for Efficient Simulation of Multiple Embedded GPUs on Virtual Platforms".

    Emilio, Paolo and Giuseppe worked on a paper titled "An Analysis of Accelerator Coupling in Heterogeneous Architectures".

    The papers will be presented in June at DAC 2015 in San Francisco, CA.

    Congratulations to all of you!

  • Article published in special issue of Design & Test

    IEEE Design & Test recently published a special issue on Cloud Computing for Embedded Systems, which includes an article coauthored by SLD members.

    This special issue includes three papers, “The Swarm at the Edge of the Cloud”, “Middleware for IoT-Cloud Integration Across Application Domains”, and “Cloud-Aided Design for Distributed Embedded Systems”.

    The last article, presented by YoungHoon Jung, Michele Petracca, and Prof. Carloni, proposes a virtual execution platform for designing an embedded system using the cloud. Using this method, it is possible to emulate the interactions of millions of embedded systems that run various applications, such as image processing.

    The special issue is available on IEEE Xplore.

  • Best Paper Award at DATE 2012

    Hung-Yi Liu and Michele Petracca, along with Prof. Carloni, have received the Best Paper Award for their work "Compositional System-Level Design Exploration with Planning of High-Level Synthesis", presented at the 2012 Design, Automation, and Test in Europe (DATE 2012) conference. It was the only best paper award assigned out of 950 submissions.


  • Three SLD papers accepted to DAC'13

    Three papers coauthored by SLD members have been accepted to DAC'13.

    Hung-Yi worked on two of these accepted submissions; he is the main author of "On Learning-Based Methods for Design-Space Exploration with High-Level Synthesis", and is the second author of "A Method to Abstract RTL IP Blocks into C++ Code and Enable High-Level Synthesis", whose first author is Nicola Bombieri. Nicola visited us last year from the University of Verona, where he is an Assistant Professor.

    YoungHoon is the main author of the third SLD accepted paper, titled "netShip: A Networked Virtual Platform for Large-Scale Heterogeneous Distributed Embedded Systems".

    The papers will be presented in June at DAC 2013 in Austin, TX.

    Congratulations to Hung-Yi and YoungHoon!

  • Best Paper Award at IEEE CloudCom

    YoungHoon Jung, Richard Neill and Luca Carloni have received the Best Paper Award for their work “A Broadband Embedded Computing System for MapReduce Utilizing Hadoop” presented at the 4th IEEE International Conference on Cloud Computing Technology and Science (CloudCom 2012) held in Taipei, Taiwan. The work was selected among the fifty-four papers accepted to the conference, which had a 17% acceptance rate.

  • Best in Session Award at SRC TECHCON 2012

    Young Jin Yoon received the Best in Session Award at the SRC TECHCON 2012 for the presentation and poster titled "VENTTI: A Vertically Integrated Framework for Simulation and Validation of Networks-on-Chip". Congratulations!

  • Accelerator Memory Reuse in Computer Architecture Letters

    Our paper titled "Accelerator Memory Reuse in the Dark Silicon Era", which describes the reuse of accelerator memory as on-chip cache in the context of many-core architectures, has been accepted for publication and will soon be available as a Computer Architecture Letter.

  • In the Press: Business Wire cites SLD work

    Michele Petracca and Professor Carloni were cited by Business Wire for their work on embedded voltage regulators, developed in collaboration with Professor Ken Shepard’s group.

    Read more at Business Wire: Columbia University and Semiconductor Research Corporation Breathe New Life into Scalability by Integrating Voltage Regulators Directly onto ICs.

  • Hung-Yi's paper on Compositional System-Level Design Exploration accepted at DATE'12

    Congratulations to Hung-Yi for getting his paper, "Compositional System-Level Design Exploration with Planning of High-Level Synthesis", accepted at the 2012 edition of the Design, Automation and Test in Europe (DATE) conference.

  • Best Student Demo Award at ACM SenSys 2011

    The demo titled "Organic Solar Cell-equipped Energy Harvesting Active Networked Tag (EnHANT) Prototypes" received the Best Student Demo Award at the ACM Conference on Embedded Networked Sensor Systems. The demo was developed by 10 students from several EE/CS groups at Columbia, with Marcin Szczodrak representing SLD. Congratulations!

  • Prof. Carloni wins Investigator Award

    Luca Carloni received a three-year research grant as one of the recipients of the 2010 Young Investigator Program from the Office of Naval Research. The winning proposal is titled "Methods for System-level Design and Programming of Heterogeneous Embedded Multi-core Platforms". More information is available here.