Maharnab Saikia

I love solving problems and building cool stuff with code. My interests include artificial intelligence, software development, computer graphics programming, and game development!

ATTR: A Transformer-Based Model for Unpaired Audio-to-Audio Translation

Abstract: Unpaired audio-to-audio translation aims to translate audio from a source domain to a target domain without paired training data. Cycle-Consistent Generative Adversarial Networks (CycleGAN) and Variational Autoencoders (VAE) have been used for this task, but these models suffer from difficult training and unsatisfactory results. Later, Contrastive Voice Conversion (CVC) was introduced, utilizing a contrastive learning-based approach to address these issues. However, these methods use CNN-based generators, which can capture local semantics but lack the ability to capture long-range dependencies necessary for global semantics. In this paper, we propose ATTR, an efficient method for unpaired audio-to-audio translation that leverages the Hybrid Perception Block (HPB) and Dual Pruned Self-Attention (DPSA) along with a contrastive learning-based adversarial approach.

AI Projects

Gemma-2-2b-it Fine-Tune (Hindi)

A fine-tuned version of Gemma 2 2b-it, developed for Kaggle's 'Unlock Global Communication with Gemma' competition. It targets language-specific tasks with a primary focus on Hindi, and was fine-tuned on a corpus of 7,640 Hindi instructions to improve its understanding and processing of the language across a variety of applications.

GPT-2 PyCode

This project features a GPT (Generative Pre-trained Transformer) language model fine-tuned for Python code generation. At 124 million parameters it matches the smallest GPT-2 configuration, far smaller than the larger GPT-2 variants or GPT-3, and it is intended primarily for testing and experimentation. It was trained on a small corpus of 25,000 Python code samples.

Flux Collage LoRA

This model is a fine-tuned version of Flux.1-dev, optimized for generating collage-style images using LoRA (Low-Rank Adaptation).
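As a rough illustration of why LoRA keeps fine-tuning cheap: instead of updating a full weight matrix of shape d_out x d_in, it trains two small matrices of shapes d_out x r and r x d_in. The sketch below only counts parameters. The function name, layer dimensions, and rank are hypothetical and are not taken from this project's actual configuration.

```python
def lora_param_counts(d_out: int, d_in: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one linear layer.

    Hypothetical helper for illustration only.
    full: parameters in the dense weight matrix W (d_out x d_in).
    lora: parameters in the low-rank factors B (d_out x rank) and A (rank x d_in).
    """
    if min(d_out, d_in, rank) <= 0:
        raise ValueError("all dimensions must be positive")
    full = d_out * d_in
    lora = rank * (d_out + d_in)
    return full, lora


if __name__ == "__main__":
    # Example values are assumptions, chosen only to show the ratio.
    full, lora = lora_param_counts(d_out=1024, d_in=1024, rank=8)
    print(f"full: {full}, lora: {lora}, ratio: {lora / full:.4f}")
```

With these assumed sizes, the low-rank factors hold under 2% of the parameters of the dense matrix, which is what makes adapting a large base model practical on modest hardware.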

Web Development

Glassmorphism

A CSS code generator for the glassmorphism UI style. Glassmorphism is a design trend that combines transparent elements, vibrant colors, and blurred backgrounds to create a visually appealing, modern user interface.
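A minimal sketch of what such a generator might emit, assuming the usual glassmorphism recipe of a semi-transparent background plus a `backdrop-filter` blur. The function name, parameters, and default values are hypothetical and do not reflect the project's actual code.

```python
def glass_css(selector: str = ".glass", blur_px: int = 10,
              alpha: float = 0.25, radius_px: int = 16) -> str:
    """Build a CSS rule for a frosted-glass panel (hypothetical helper)."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if blur_px < 0 or radius_px < 0:
        raise ValueError("blur_px and radius_px must be non-negative")
    lines = [
        f"{selector} {{",
        # Semi-transparent white fill lets the background show through.
        f"  background: rgba(255, 255, 255, {alpha});",
        # Blur whatever is rendered behind the element.
        f"  backdrop-filter: blur({blur_px}px);",
        f"  -webkit-backdrop-filter: blur({blur_px}px);",
        # A faint border helps the panel edge read as glass.
        "  border: 1px solid rgba(255, 255, 255, 0.3);",
        f"  border-radius: {radius_px}px;",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(glass_css(selector=".card", blur_px=12, alpha=0.2))
```

The transparency and blur are exposed as parameters because those are the two values a user of such a tool would most plausibly tune.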

Skybound

This is a small platformer game created as a learning project. Play as a brave knight on a mission to collect four magical fruits from different worlds to save your king. Dodge enemies like slimes, collect coins, and explore vibrant levels.