The slow down in Moore’s Law has resulted in poor scaling of performance and energy. This slow down in scaling has been accompanied by the explosive growth of cognitive computing applications, creating a demand for high performance and energy efficient solutions. Amidst this climate, FPGA-based accelerators are emerging as a potential platform for deploying accelerators for cognitive computing workloads. However, the slow-down in scaling also limits the scaling of memory and I/O bandwidths. Additionally, a growing fraction of energy is spent on data transfer between off-chip memory and the compute units. Thus, now more than ever, there is a need to leverage near-memory and in-storage computing to maximize the bandwidth available to accelerators, and further improve energy efficiency. In this paper, we make the case for leveraging FPGAs in near-memory and in-storage settings, and present opportunities and challenges in such scenarios. We introduce a conceptual FPGA-based near-data processing architecture, and discuss innovations in architecture, systems, and compilers for accelerating cognitive computing workloads..