Generative Adversarial Networks (GANs) have been one of the most hyped topics of recent years. Built on the well-known generator-discriminator mechanism, their conceptually simple design has driven research to continuously improve on the original architecture. In image generation, the peak has been reached by StyleGANs, which produce astonishingly realistic, high-quality images able to fool even humans.
While the generation of new samples has achieved excellent results in the 2D domain, 3D GANs are still highly inefficient. If the exact mechanism of 2D GANs is applied in the 3D setting, the computational cost becomes prohibitive, since 3D data is hard for current GPUs to manipulate. For this reason, research has focused on constructing geometry-aware GANs that can infer the underlying 3D structure from 2D images alone. In that case, however, the approximations are usually not 3D-consistent.
Paper: https://arxiv.org/pdf/2112.07945.pdf
Project: https://matthew-a-chan.github.io/EG3D/
I have no idea if we are in a simulation, but I find it very lucky or strange, depending on how you look at it, that I was born on the cusp of such technology. If humans have really been around as long as evolution says they have, what are the odds that I was born during this time? With this technology? During one of the most politically charged times ever recorded in human history? What are the odds that I would be born to witness the history and the technology that we're seeing right now? I think about that a lot: how lucky I am to be alive at this moment in time. Do you think that we are in a simulation? I'm genuinely curious.
If there are training data pairs of an image and a corresponding textual explanation of that image, how can I represent the image and its textual meaning as knowledge in an AI model? The goal is that, if I can infuse that knowledge, I won't need much training data. The idea is to infuse this knowledge and then, at inference time, perform an action whenever such an image appears. For example, if an image's associated meaning is 'open the door', then when the robot sees that image at inference time it will open the door. I am thinking of an NLP-based model to infuse this knowledge, but I'm quite confused.
Is there an existing SOTA approach that addresses this problem?
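One common way to realize the idea above is retrieval: store an embedding of each training image together with its instruction, then at inference time embed the observed image, find the closest stored embedding, and fire the associated action. The sketch below is a minimal, hypothetical illustration of that pattern; the `embed` function is a stand-in placeholder (in practice one would use a pretrained vision-language encoder such as CLIP), and all names are illustrative assumptions, not a specific published system.

```python
import numpy as np

def embed(image):
    """Placeholder embedding: flatten and L2-normalize the pixels.
    In a real system this would be a pretrained vision encoder."""
    v = np.asarray(image, dtype=float).ravel()
    return v / (np.linalg.norm(v) + 1e-8)

class KnowledgeBase:
    """Stores (image embedding, instruction) pairs and retrieves the
    instruction for the most similar image at inference time."""
    def __init__(self):
        self.embeddings = []
        self.instructions = []

    def add(self, image, instruction):
        # "Infuse" one piece of knowledge: an image paired with its meaning.
        self.embeddings.append(embed(image))
        self.instructions.append(instruction)

    def query(self, image, threshold=0.9):
        # Cosine similarity against all stored embeddings
        # (vectors are already unit-norm, so a dot product suffices).
        q = embed(image)
        sims = np.array([q @ e for e in self.embeddings])
        best = int(sims.argmax())
        if sims[best] >= threshold:
            return self.instructions[best]
        return None  # no confident match -> take no action

# Usage: one stored pair, then a lookup at "inference time".
kb = KnowledgeBase()
door_image = np.array([[1.0, 0.0], [0.0, 1.0]])  # stand-in "door" image
kb.add(door_image, "open the door")
print(kb.query(door_image))  # -> open the door
```

With a strong pretrained encoder in place of the placeholder, this kind of nearest-neighbor lookup needs only one or a few examples per action, which is exactly the low-data regime the question asks about.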