In 2014, Ian Goodfellow, a machine learning researcher, co-authored the first paper on Generative Adversarial Networks with his colleagues; in this paper, he describes the GAN architecture's simplification. Which is a stack of two neural networks, a generator, and a discriminator in a nutshell. This paper contains two networks, one of which attempts to generate images and the other of which attempts to determine which images are real and which are not. For instance, if we want to generate dogs, we must first learn what a dog is. The generator begins synthesizing dogs with the learned characteristics of dogs, but the discriminators' task is to determine whether the generated images are of dogs. The initial iterations are very easy to distinguish, and all of this information (also known as weights) is then returned to the generator, instructing it on how to trick the discriminator. This is a continuous feedback loop that continues until the generator generates images that are virtually indistinguishable from real ones, allowing for the creation of extremely realistic images of, say, dogs.
Although the initial GAN architecture was immature and could only generate images of very low quality. Additional enhancements were being made to it. For instance, DCGAN, which was published in a paper in 2017, improved the image quality and image generation. Other aspects of architecture are improving as well, and images are becoming more realistic Daily. However, using GANs exclusively on images is not the only application; a large portion of the data we use in quality assurance is not in image format but rather in tabular format. How can we convert these GANs trained on images to tabular data? To begin, when we look at images mathematically, we only see pixels and the RGB codes that contain information about the pixels, which means that one image could be a table. Recently, researchers began investigating how these GAN architectures might be applied to traditional tabular data. Tabular data, on the other hand, imposes its own constraints. The first issue is that we have many distinct data types. We do not have a single type, such as integers, but also decimals, categories, and text. Time, names of dates, and so forth. Second, the distributions of these variables are quite dissimilar as well. Thirdly, many tables contain categorical variables with numerous options such as names, and even if you want to one-hot encode, you will end up with an enormous number of columns, making it impossible to apply a traditional GAN trained on image data to tabular data.