
Yiwei Ma   马祎炜

Second-year Ph.D. Student at Xiamen University


Email: yiweima@stu.xmu.edu.cn
[GitHub] [Google Scholar]

[Biography] [Latest News] [Publications] [Projects] [Major Awards] [Patent] [Professional Activities]

Biography

I am currently a second-year Ph.D. student in the Department of Artificial Intelligence, School of Informatics, Xiamen University, advised by Prof. Rongrong Ji and Prof. Xiaoshuai Sun.

My recent research interests are (2D/3D) vision-and-language learning and AI-generated content (AIGC).

Publications

Journal

Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Xiaopeng Hong, Yongjian Wu, Rongrong Ji
Image Captioning via Dynamic Path Customization
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2024
[PDF] [arXiv] [Code]
Yiwei Ma, Yijun Fan, Jiayi Ji, Haowei Wang, Xiaoshuai Sun, Guannan Jiang, Annan Shu, Rongrong Ji
X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation
ACM Transactions on Multimedia Computing, Communications, and Applications (ToMM), 2024
[arXiv] [Code] [Project Page]
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Rongrong Ji
Towards Local Visual Modeling for Image Captioning
Pattern Recognition (PR), 2023
[PDF] [arXiv] [Code]
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji
Knowing what it is: Semantic-enhanced Dual Attention Transformer
IEEE Transactions on Multimedia (TMM), 2022
[PDF] [Code]
Jiayi Ji, Yiwei Ma (co-first author), Xiaoshuai Sun, Yiyi Zhou, Yongjian Wu, Rongrong Ji
Knowing What to Learn: A Metric-oriented Focal Mechanism for Image Captioning
IEEE Transactions on Image Processing (TIP), 2022
[PDF] [Code]

Conference

Yiwei Ma, Jiayi Ji, Ke Ye, Weihuang Lin, Zhibin Wang, Yonghan Zheng, Qiang Zhou, Xiaoshuai Sun, Rongrong Ji
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
Conference on Neural Information Processing Systems (NeurIPS), 2024
[arXiv] [Code]
Yiwei Ma, Zhekai Lin, Jiayi Ji, Yijun Fan, Xiaoshuai Sun, Rongrong Ji
X-Oscar: A Progressive Framework for High-quality Text-guided 3D Animatable Avatar Generation
International Conference on Machine Learning (ICML), 2024
[arXiv] [Code] [Project Page]
Yiwei Ma, Xiaoqing Zhang, Xiaoshuai Sun, Jiayi Ji, Haowei Wang, Guannan Jiang, Weilin Zhuang, Rongrong Ji
X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
IEEE International Conference on Computer Vision (ICCV), 2023
[PDF] [arXiv] [Code] [Project Page]
Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
ACM International Conference on Multimedia (ACM MM), 2022 (Citations: 200+)
[PDF] [arXiv] [Code] [Project Page]
Yiwei Ma, Xiaoshuai Sun, Jiayi Ji, Guannan Jiang, Weilin Zhuang, Rongrong Ji
Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval
ACM International Conference on Multimedia (ACM MM), 2023
[PDF] [Code] [Project Page]
Zhipeng Qian, Yiwei Ma (co-first author), Zhekai Lin, Jiayi Ji, Xiawu Zheng, Xiaoshuai Sun, Rongrong Ji
Multi-branch Collaborative Learning Network for 3D Visual Grounding
European Conference on Computer Vision (ECCV), 2024
[arXiv] [Code]
Sihan Liu, Yiwei Ma (co-first author), Xiaoqing Zhang, Haowei Wang, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji
Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
Computer Vision and Pattern Recognition Conference (CVPR), 2024
[arXiv] [Code]
Changli Wu, Yiwei Ma (co-first author), Qi Chen, Haowei Wang, Gen Luo, Jiayi Ji, Xiaoshuai Sun
3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation
AAAI Conference on Artificial Intelligence (AAAI), 2024
[arXiv] [Code] [PDF]
Zhipeng Qian, Yiwei Ma (co-first author), Jiayi Ji, Xiaoshuai Sun
X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks
AAAI Conference on Artificial Intelligence (AAAI), 2024
[Code] [PDF]
Changli Wu, Qi Chen, Haowei Wang, Yiwei Ma, You Huang, Gen Luo, Hao Fei, Jiayi Ji, Xiaoshuai Sun, Rongrong Ji
RG-SAN: Rule-Guided Spatial Awareness Network for End-to-End 3D Referring Expression Segmentation
Conference on Neural Information Processing Systems (NeurIPS), 2024, Oral (Top 0.46%)
Changli Wu, Yihang Liu, Jiayi Ji, Yiwei Ma, Haowei Wang, Gen Luo, Henghui Ding, Xiaoshuai Sun, Rongrong Ji
3D-GRES: Generalized 3D Referring Expression Segmentation
ACM International Conference on Multimedia (ACM MM), 2024, Oral (Top 3.97%)
[arXiv] [Code]
Danni Yang, Ruohan Dong, Jiayi Ji, Yiwei Ma, Haowei Wang, Xiaoshuai Sun, Rongrong Ji
Exploring Phrase-Level Grounding with Text-to-Image Diffusion Model
European Conference on Computer Vision (ECCV), 2024
[arXiv] [Code]
Danni Yang, Jiayi Ji, Yiwei Ma, Tianyu Guo, Haowei Wang, Xiaoshuai Sun, Rongrong Ji
SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation
International Conference on Machine Learning (ICML), 2024, Oral (Top 1.52%)
[arXiv] [Code]
Tianyu Guo, Haowei Wang, Yiwei Ma, Jiayi Ji, Xiaoshuai Sun
Improving Panoptic Narrative Grounding by Harnessing Semantic Relationships and Visual Confirmation
AAAI Conference on Artificial Intelligence (AAAI), 2024
[Code] [PDF]
Haowei Wang, Jiji Tang, Jiayi Ji, Xiaoshuai Sun, Rongsheng Zhang, Yiwei Ma, Minda Zhao, Lincheng Li, Zeng Zhao, Tangjie Lv, Rongrong Ji
Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation
ACM International Conference on Multimedia (ACM MM), 2023
[PDF] [arXiv] [Code]
Danni Yang, Jiayi Ji, Xiaoshuai Sun, Haowei Wang, Yinan Li, Yiwei Ma, Rongrong Ji
Semi-Supervised Panoptic Narrative Grounding
ACM International Conference on Multimedia (ACM MM), 2023
[PDF] [arXiv] [Code]

Preprint

Yiwei Ma, Zhibin Wang, Xiaoshuai Sun, Weihuang Lin, Qiang Zhou, Jiayi Ji, Rongrong Ji
INF-LLaVA: Dual-perspective Perception for High-Resolution Multimodal Large Language Model
arXiv preprint arXiv:2407.16198, 2024
[arXiv] [Code]
Jiayi Ji, Haowei Wang, Changli Wu, Yiwei Ma, Xiaoshuai Sun, Rongrong Ji
JM3D & JM3D-LLM: Elevating 3D Representation with Joint Multi-modal Cues
arXiv preprint arXiv:2310.09503, 2023
[arXiv] [Code]

Projects


External-Attention-pytorch
PyTorch implementations of various attention mechanisms, MLPs, re-parameterization, and convolution modules, intended to help readers understand the corresponding papers in more depth.
[GitHub] (11400+ stars)

Patent

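  • (Granted) A referring 3D instance segmentation method based on text information - Publication No.: CN117634486A - [Details]
  • (Granted) A referring 3D instance segmentation method based on chain-style perception - Publication No.: CN117593527A - [Details]
  • (Granted) A method for phrase-level grounding using a text-to-image diffusion model - Publication No.: CN118247799A - [Details]
  • A referring remote sensing image segmentation method based on multi-scale feature interaction and adaptive rotated dynamic convolution - Publication No.: CN117808826A - [Details]
  • A text-driven 3D stylization method based on dynamic textual guidance - Publication No.: CN116704090A - [Details]
  • An end-to-end multi-grained contrastive learning method for video-text retrieval - Publication No.: CN115757713A - [Details]
  • An image captioning method based on local visual modeling - Publication No.: CN115964530A - [Details]
  • A bi-directional one-to-many embedding alignment method for text-based person retrieval - Publication No.: CN116304145A - [Details]
  • A 3D content creation method - Publication No.: CN117593469A - [Details]
  • A video-text retrieval method based on temporal fusion of textual and visual contextual relations - Publication No.: CN117407561A - [Details]
  • A semi-supervised learning method for referring object segmentation - Publication No.: CN117975241A - [Details]
  • A 3D referring object segmentation method based on a spatial awareness network - Publication No.: CN118365659A - [Details]

Major Awards

Professional Activities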