Hi, I'm Osama.



GW :: Computer Engineering

I'm likely


Greetings!


  • I am a third-year PhD student in Computer Engineering at The George Washington University (GW), advised by Dr. Gina Adam.
  • I have a Bachelor's degree in Computer Science with a minor in Mathematics from Habib University, Pakistan.
  • Interests: Neuromorphic Computing, Machine Learning, Data Science, Hardware-Software Co-Design.

Currently,


Open to Internships!


  • I am actively looking for a remote/in-person internship for Summer 2024 in the United States.
  • I am interested in, and possess demonstrable expertise in, the following areas:
    • Machine Learning
    • Data Science & Engineering
    • Software Design & Engineering
    • FPGA Prototyping
  • Please reach out if you have any relevant roles in your organization!

Research Projects


Daffodil: A Resistive Neural Network Benchmarking Platform

Description:
Developing Daffodil, a novel prototyping platform for benchmarking resistive memory-based crossbar neural networks. It consists of a custom PCB housing an array of up to 20k emerging memory devices and connects to a Zynq-based FPGA host.
Technologies:
Python, C, Verilog, Vivado, U-Boot
Research Outcomes:
  • Hoskins, B. D., Ma, W., Fream, M., Yousuf, O., Daniels, M. W., Goodwill, J., Madhavan, A., Tung, H., Branstad, M., Liu, M., Madsen, R., McClelland, J., Adam, G.C., Lueker-Boden, M. (2021). A System for Validating Resistive Neural Network Prototypes. In International Conference on Neuromorphic Systems (ICONS), July 2021, https://doi.org/10.1145/3477145.3477260.
  • Yousuf, O., Borders, W.A., Hoskins, B.D., Madhavan, A., Ramu, K., Adam, G.C. (2023). Experimental Demonstration of Inference with ReRAM devices using Daffodil: A Mixed-Signal Prototyping Platform. In preparation for Nature Communications: Neuromorphic Hardware and Computing 2023.


A Streaming Hardware Architecture for Training Neuromorphic Arrays

Description:
Assisting in the design, characterization, and testing of a hardware-aware streaming singular value decomposition (SVD) algorithm for supervised and unsupervised learning over matrix manifolds.
Technologies:
Python, PyTorch, C++, pybind11
Research Outcomes:
  • Daniels, M. W., Hoskins, B. D., Madhavan, A., Yousuf, O., Adam, G. C., Branstad, M., Tung, H., Madsen, R., Lueker-Boden, M., McClelland, J., Stiles, M. D. Quasisystolic Arrays for Pipelined and Resource-Efficient Neural Network Training. Best Poster at Sigma Xi NIST: AI, Machine Learning, Engineering, Nanotechnology, and Math, March 2021.
  • Daniels, M. W., Hoskins, B. D., Madhavan, A., Yousuf, O., Adam, G. C., Lueker-Boden, M., McClelland, J., Stiles, M. D. A Quasisystolic ASIC for Pipelined and Resource-Efficient Neural Network Training with Dense Neuromorphic Arrays. In preparation for Physical Review Applied, 2024.


Memristive Device & Neural Network Modeling

Description:
Investigated a statistical modeling approach known as jump table modeling. Utilized synthetic and experimental models for efficiently simulating characteristics of a population of emerging memory devices (RRAM, FeFETs) such as conductance vs. applied voltage pulse. Developed a framework for training deep networks where layer weights are modeled by a pair of device jump tables, in turn simulating the training of a crossbar of emerging memory devices.
Technologies:
Python, PyTorch, R, Mathematica
Research Outcomes:
  • Yousuf, O., Hossen, I., Daniels, M. W., Lueker-Boden, M., Dienstfrey, A., Adam, G.C. (2023). Device Modeling Bias in ReRAM-based Neural Network Simulations. In IEEE Journal on Emerging and Selected Topics in Circuits and Systems, January 2023, doi: 10.1109/JETCAS.2023.3238295.
  • Yousuf, O., Hossen, I., Daniels, M. W., Lueker-Boden, M., Dienstfrey, A., Adam, G.C. (2022). Investigating Bias in the Modeling of ReRAM Devices. Poster Presentation at International Conference on Memristive Materials, Devices & Systems (MEMRISYS), November-December 2022.
  • Yousuf, O., Hossen, I., Glasmann, A.L., Najmaei, S., Adam, G.C. (2023). Neural Network Modeling Bias for Hafnia-based FeFETs. Submitted to International Symposium on Nanoscale Architectures (NANOARCH) 2023.
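As a rough illustration of the jump table idea described above, the sketch below splits the conductance range into bins and stores, for each bin, a small set of possible conductance changes per pulse. Simulating a pulse then means sampling one change from the bin the device currently occupies. The bin edges, table values, and function names are made up for illustration and are not taken from the publications listed above.

```python
import bisect
import random

# Hypothetical synthetic jump table: conductance range [0, 1] split into 4 bins.
# Each bin lists possible conductance changes (delta G) for one potentiation pulse.
BIN_EDGES = [0.25, 0.5, 0.75]  # interior edges separating the 4 bins
POTENTIATE_TABLE = [
    [0.08, 0.10, 0.12],  # low conductance: larger jumps
    [0.05, 0.06, 0.08],
    [0.02, 0.03, 0.04],
    [0.00, 0.01, 0.02],  # near saturation: smaller jumps
]
G_MIN, G_MAX = 0.0, 1.0


def apply_pulse(g, table, rng):
    """Sample one conductance jump from the bin that g falls in."""
    bin_index = bisect.bisect_right(BIN_EDGES, g)
    delta = rng.choice(table[bin_index])
    return min(G_MAX, max(G_MIN, g + delta))


def simulate(n_pulses, g0=0.0, seed=0):
    """Return the conductance trace of one simulated device."""
    rng = random.Random(seed)
    trace = [g0]
    for _ in range(n_pulses):
        trace.append(apply_pulse(trace[-1], POTENTIATE_TABLE, rng))
    return trace
```

Calling `simulate(50)` returns a conductance-vs-pulse-number trace for one synthetic device. A weight modeled by a pair of devices could then be taken as the difference of two such conductances, which is one common convention and an assumption here.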


Algorithms for Network Gradient Decomposition

Description:
Explored various low-rank, streaming matrix decomposition methods and investigated their applicability in efficiently training crossbar-based hardware neural networks.
Technologies:
Python, C++, Mathematica
Research Outcomes:
  • Zhao, J., Huang, S., Yousuf, O., Gao, Y., Hoskins, B. D., Adam, G.C. (2021). Gradient Decomposition Methods for Training Neural Networks with Non-Ideal Synaptic Devices. In Frontiers in Neuroscience: Neuromorphic Computing 2021, doi: 10.3389/fnins.2021.749811.
  • Yousuf, O., Daniels, M. W., Dienstfrey, A., Adam, G.C. (2022). Towards a Hardware-Aware Decomposition Method for ReRAM Neural Network Training. Oral Presentation at International Conference on Neuromorphic Systems (ICONS), July 2022.
  • Yousuf, O., Daniels, M. W., Dienstfrey, A., Adam, G.C. (2022). Streaming Gradient Tracking using Non-negative Matrix Factorization. Poster Presentation at GW Research Showcase, George Washington University, April 2022.
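To make the low-rank idea concrete: the weight gradient of a dense layer over a batch is a sum of outer products, and a truncated SVD keeps only its leading rank-1 terms. The sketch below uses NumPy and an ordinary batch SVD rather than a streaming method, with random data and hypothetical shapes, so it illustrates the general technique and not the specific algorithms in the publications above.

```python
import numpy as np


def batch_gradient(deltas, activations):
    """Full weight gradient: sum over the batch of outer(delta_i, activation_i)."""
    return deltas.T @ activations


def low_rank_approx(grad, rank):
    """Best rank-`rank` approximation of grad in the Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(grad, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank, :]
    return approx, s


rng = np.random.default_rng(0)
deltas = rng.standard_normal((32, 10))       # 32 back-propagated error vectors, 10 outputs
activations = rng.standard_normal((32, 20))  # matching layer inputs, 20 features
grad = batch_gradient(deltas, activations)   # shape (10, 20)
approx, singular_values = low_rank_approx(grad, rank=3)
```

The approximation error equals the root of the sum of squares of the discarded singular values, so inspecting `singular_values` shows how much of the gradient a given rank retains.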

Other Projects


End-to-End Deep Neural Network Inference on an FPGA

Description:
Wrote RTL modules for quantized vector-matrix multiplication and non-linear activation functions using Verilog. Demonstrated successful inference on MNIST using pre-trained weights of a 2-layer perceptron network. Deployed the solution onto a Zynq-based FPGA using the hls4ml framework.
Technologies:
Verilog, Python, Vivado, hls4ml (for high-level synthesis)
Demo: Video
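For readers unfamiliar with quantized vector-matrix multiplication, here is a minimal software sketch of the arithmetic involved. It assumes symmetric 8-bit quantization with a single scale per tensor and a ReLU activation; the actual bit widths, scaling scheme, and activation used in the RTL are not stated above, so treat these choices and all function names as illustrative.

```python
def quantize(values, n_bits=8):
    """Symmetric uniform quantization: return integer codes and the scale factor."""
    q_max = 2 ** (n_bits - 1) - 1
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / q_max
    return [round(v / scale) for v in values], scale


def quantized_vmm(x, weights, n_bits=8):
    """Vector-matrix product using integer multiply-accumulate, rescaled at the end."""
    x_q, x_scale = quantize(x, n_bits)
    flat_q, w_scale = quantize([w for row in weights for w in row], n_bits)
    n_cols = len(weights[0])
    w_q = [flat_q[r * n_cols:(r + 1) * n_cols] for r in range(len(weights))]
    outputs = []
    for col in range(n_cols):
        # Integer accumulator, as a hardware MAC unit would hold.
        acc = sum(x_q[row] * w_q[row][col] for row in range(len(x)))
        outputs.append(acc * x_scale * w_scale)
    return outputs


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


x = [0.5, -1.0, 0.25]
weights = [[0.2, -0.4], [0.1, 0.3], [-0.5, 0.6]]  # 3 inputs, 2 outputs
layer_out = relu(quantized_vmm(x, weights))
```

The multiply-accumulate loop works only on integers, and the floating-point scales are applied once per output, which is what makes this style of computation cheap to map onto fixed-point hardware.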


A Reinforcement Learning Framework for Autonomous Vehicles

Description:
In Summer 2019, I interned at the Pi Star AI and Optimization Lab at Texas A&M University as a research scholar, studying and simulating various machine learning algorithms and their efficiency in the context of autonomous driving. I led the development of the core framework for the simulation side of the project and worked with a team of graduates as well as post-graduates. This framework was based on the CARLA simulator and was compatible with OpenAI's Gym library.
Technologies:
Python, Gym, CARLA
Code: GitHub


Personal Outfit Recommendation System

Description:
A fully functional AI-powered fashion e-commerce web application built using a modern and efficient technology stack. The system can recommend outfits from brands based on input user images and ongoing trends through an assortment of different image processing and artificial intelligence algorithms.
Technologies:
Django and Python for the back-end, React-TS/JS for the front-end, PostgreSQL as the database, and GraphQL for the client-server API (implemented using Apollo Client at the front-end and Graphene-Django at the back-end), with PyTorch and TensorFlow for the recommendation engine.
Demo: Video Code: GitHub


Ant Colony Optimization (ACO)

Description:
An interactive simulation built using principles of Object-Oriented Programming for visualizing the solution to the shortest path problem through ACO. Parameters include ant density, dynamic obstacles, solution randomness, and pheromone visibility. The included artwork was also self-designed to add a touch of personalization.
Technologies:
Processing for Python.
Demo: Video Code: GitHub
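The two core rules of ACO can be sketched in a few lines of plain Python. This follows the textbook Ant System formulation (move probability proportional to pheromone and inverse distance, then evaporation followed by deposit) on a small node graph; the parameter names and default values are assumptions for illustration and do not reflect the simulation's actual code.

```python
import random


def choose_next(current, candidates, pheromone, distance, alpha=1.0, beta=2.0, rng=random):
    """Pick the next node with probability proportional to tau^alpha * (1/d)^beta."""
    weights = [
        (pheromone[current][j] ** alpha) * ((1.0 / distance[current][j]) ** beta)
        for j in candidates
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]


def update_pheromone(pheromone, paths, distance, evaporation=0.5, deposit=1.0):
    """Evaporate all trails, then let each ant deposit pheromone along its path."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= (1.0 - evaporation)
    for path in paths:
        length = sum(distance[a][b] for a, b in zip(path, path[1:]))
        for a, b in zip(path, path[1:]):
            # Shorter paths receive more pheromone per edge.
            pheromone[a][b] += deposit / length
            pheromone[b][a] += deposit / length


distance = [[0, 1, 4], [1, 0, 2], [4, 2, 0]]  # hypothetical 3-node graph
pheromone = [[1.0] * 3 for _ in range(3)]
update_pheromone(pheromone, paths=[[0, 1, 2]], distance=distance)
next_node = choose_next(0, [1, 2], pheromone, distance, rng=random.Random(0))
```

Raising `beta` makes ants greedier about short edges, while raising `evaporation` makes the colony forget old trails faster, which is one way a "solution randomness" control could be exposed.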


Multi-feature Raytracer

Description:
A feature-rich raytracer written entirely in vanilla C++ that can render stills from imported 3D models. Salient features include a variety of lighting sources, a well-grounded array of materials, and a few shading options. It also makes use of multithreading and a few efficient, industry-grade data structures to accelerate the ray-tracing process.
Technologies:
Vanilla C++ (std:C++17)
Demo: Blog


Flight Simulator

Description:
A simple fly-by view over an infinite terrain generated through Perlin Noise. The project makes use of the traditional rendering pipeline in Computer Graphics. Vertex shaders, fragment shaders, flat and smooth shading, wireframe generation, procedural terrain generation, affine transformations, and various other schemes were explored and implemented.
Technologies:
JavaScript, HTML, CSS, WebGL 1.0


Finite Element Solver

Description:
A complete end-to-end finite element solver for a given system modelled by Poisson's equation. The project implements the global stiffness matrix procedure and solves for a given mesh file under given boundary conditions.
Technologies:
Python with NumPy and Matplotlib.
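The global stiffness matrix procedure can be shown on the simplest possible case. The sketch below solves -u'' = f on [0, 1] with u(0) = u(1) = 0 using linear elements on a uniform 1D mesh; the reduction to one dimension, the constant source term, and the function name are simplifying assumptions, whereas the project itself works from a mesh file.

```python
import numpy as np


def solve_poisson_1d(n_elements, source=1.0):
    """Solve -u'' = source on [0, 1] with u(0) = u(1) = 0 using linear elements."""
    n_nodes = n_elements + 1
    nodes = np.linspace(0.0, 1.0, n_nodes)
    h = nodes[1] - nodes[0]

    # Element stiffness matrix and load vector for a linear element of length h.
    k_local = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])
    f_local = (source * h / 2.0) * np.array([1.0, 1.0])

    # Assemble the global stiffness matrix and load vector element by element.
    k_global = np.zeros((n_nodes, n_nodes))
    f_global = np.zeros(n_nodes)
    for e in range(n_elements):
        idx = [e, e + 1]
        k_global[np.ix_(idx, idx)] += k_local
        f_global[idx] += f_local

    # Apply the Dirichlet conditions by solving only for the interior nodes.
    interior = np.arange(1, n_nodes - 1)
    u = np.zeros(n_nodes)
    u[interior] = np.linalg.solve(
        k_global[np.ix_(interior, interior)], f_global[interior]
    )
    return nodes, u


nodes, u = solve_poisson_1d(n_elements=8)
```

For a constant source of 1 the exact solution is u(x) = x(1 - x)/2, which gives a convenient check on the assembled system.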


3D-Humanoid Walk Simulation

Description:
The project implements two well-known approaches in machine learning, namely behavioral cloning (imitation learning) and reinforcement learning, on the Humanoid-v2 environment from OpenAI’s Gym package for Python 3, using the MuJoCo-200 bindings on Linux. The goal of this environment is to train the humanoid to perform a balanced walk from scratch – one that doesn’t deviate from its initial straight trajectory. Both approaches were implemented, and their pros and cons were analyzed.
Technologies:
Python with OpenAI Gym, MuJoCo-200.
Demo: Video Code: GitHub


RISC-V Processor

Description:
A complete single-cycle processor datapath based on the RISC-V architecture, built using a module-first, bottom-up approach. The processor can execute a specific set of instructions, which include R-Type, I-Type, and Branch-Type instructions. The main modules include a register file, a 64-bit ALU, data memory, instruction memory, program counter, a full adder, an ALU controller, a control unit, an immediate generator, a multiplexer, and an instruction parser.
Technologies:
Verilog


University Management System

Description:
A simple and elegant university management system offering separate faculty, admin, and student portals. A student can enroll in the offered courses each term, up to a maximum number of credits set by the admin. Students can also view their grade sheets for enrolled terms, and the faculty can assign assignment and course grades to students. The admin can monitor and oversee all these processes.
Technologies:
MySQL, Microsoft Visual Studio, Windows Forms, DB Designer.


Brick Crumble - A PC Videogame

Description:
A remake of the classic Arkanoid (i.e., Brick Breaker) video game, written in under a month. Strong emphasis was placed on design principles and Object-Oriented Programming (OOP). Classes were written for each object in the game, and the interfaces are decoupled from the implementation to maximize encapsulation and abstraction. Inheritance and polymorphism were applied consistently wherever the class relationships called for them.
Technologies:
C++, SDL 2.0 and extension libraries (SDL_mixer and SDL_image), Photoshop, Illustrator.
Demo: Video Code: GitHub


Runway Radio - Comprehensive IT System

Description:
Runway Radio is Habib University's first and only student-run radio network. It operates over Habib's intranet and serves as a platform for students from all departments to hone their communication and verbal skills and engage in an array of intellectual and stimulating conversations. The IT system for the radio includes a Progressive Web App (PWA), a website, and an internal Shoutcast server for live broadcasting/streaming.
Technologies:
WordPress, Shoutcast, WAMP stack.
Demo: Website Code: GitHub