Software Rendering Part 1: Setting it all up

By Metalhead33
3 years ago
Topics: Software, 3D, Programming

This article is a follow-up to my previous article on hardware-accelerated graphics being a mistake. In addition to writing a rudimentary software renderer, I will be demonstrating to the masses more or less how GPUs work under the hood, at least to a degree.

This article assumes that you already have some prior knowledge of the C++ language, specifically the C++11 variant, albeit some C++20 features might also be used. I will seek to produce readable code at the expense of performance.

What software are we using?

This software renderer is written in C++, specifically the C++20 variant, albeit we'll be mostly sticking to C++11 features. Discounting any libraries that import textures, or header-only libraries like GLM, the only external library we'll be relying on is SDL2. This article assumes you already know how to add dynamically linked libraries to projects.

So open up your favourite C++ IDE, and start a new project. I'll be using Qt Creator. What greets us is, obviously, Hello World...

Hello, world!
#include <iostream>

using namespace std;

int main()
{
	cout << "Hello world!" << endl;
	return 0;
}

So, what we need to do is to start working. Now, you may feel tempted to write C-style code, where we simply do this:

#include <iostream>
#include <SDL2/SDL.h>

using namespace std;
static const int WIDTH = 640;
static const int HEIGHT = 480;

int main()
{
	SDL_Init(SDL_INIT_VIDEO);
	SDL_Window* win = SDL_CreateWindow("Software Renderer Tutorial",0,0,WIDTH,HEIGHT,0);
	bool isInterrupted=false;
	do {
		SDL_Event ev;
		while(SDL_PollEvent(&ev)) {
			switch(ev.type) {
				case SDL_QUIT: isInterrupted = true; break;
				default: break;
			}
		}
	} while(!isInterrupted);
	SDL_DestroyWindow(win); // C-style means cleaning up by hand
	SDL_Quit();
	return 0;
}

But this will lead to difficult-to-read code in the long run. Sure, it's a lot of up-front investment, but I recommend wrapping SDL constructs in C++ classes from the very start, and encapsulating everything.

A little bit of upfront investment

After a "little" bit of refactoring, we end up with the following:

AppSystem.hpp
// AppSystem.hpp
#ifndef APPSYSTEM_HPP
#define APPSYSTEM_HPP
#include <memory>
#include <string>
#include <SDL2/SDL.h>

class AppSystem
{
public:
	typedef std::unique_ptr<SDL_Window,decltype(&SDL_DestroyWindow)> uWindow;
private:
	AppSystem(const AppSystem& cpy) = delete; // Disable copy constructor
	AppSystem& operator=(const AppSystem& cpy) = delete; // Disable copy assignment operator
protected:
	uWindow window;
public:
	AppSystem(AppSystem&& mov); // Move constructor
	AppSystem& operator=(AppSystem&& mov); // Move assignment operator
	// Regular constructors
	virtual ~AppSystem() = default;
	AppSystem(const char *title, int offsetX, int offsetY, int width, int height, Uint32 flags);
	AppSystem(const std::string& title, int offsetX, int offsetY, int width, int height, Uint32 flags);
	void run();
};

#endif // APPSYSTEM_HPP

AppSystem.cpp
#include "AppSystem.hpp"

AppSystem::AppSystem(AppSystem&& mov)
	: window(std::move(mov.window))
{

}
AppSystem& AppSystem::operator=(AppSystem&& mov)
{
	this->window = std::move(mov.window);
	return *this;
}

AppSystem::AppSystem(const char *title, int offsetX, int offsetY, int width, int height, Uint32 flags)
	: window(SDL_CreateWindow(title,offsetX,offsetY,width,height,flags),SDL_DestroyWindow)
{

}

AppSystem::AppSystem(const std::string &title, int offsetX, int offsetY, int width, int height, Uint32 flags)
	: window(SDL_CreateWindow(title.c_str(),offsetX,offsetY,width,height,flags),SDL_DestroyWindow)
{

}

void AppSystem::run()
{
	bool isInterrupted=false;
	do {
		SDL_Event ev;
		while(SDL_PollEvent(&ev)) {
			switch(ev.type) {
				case SDL_QUIT: isInterrupted = true; break;
				default: break;
			}
		}
	} while(!isInterrupted);
}

There, much cleaner. In fact, I think we can go a little bit further...

void AppSystem::processEvent(const SDL_Event &ev, bool &causesExit) 
{
	switch(ev.type) {
	case SDL_QUIT: causesExit = true; break;
	default: break;
	}
}
void AppSystem::updateLogic()
{
	// Pls implement me
}
void AppSystem::render()
{
	// Pls implement me
}

void AppSystem::run()
{
	bool isInterrupted=false;
	do {
		SDL_Event ev;
		while(SDL_PollEvent(&ev)) {
			processEvent(ev,isInterrupted);
		}
		updateLogic();
		render();
	} while(!isInterrupted);
}

So, what did we actually do here?

We applied RAII to the SDL_Window: the std::unique_ptr destroys it for us via SDL_DestroyWindow() when the AppSystem goes away, so we no longer have to clean it up manually. We also split the initial main loop into three clearly separate functions, which allows us to cleanly reimplement them when needed.
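
To see the idea in isolation, here is a minimal sketch that doesn't need SDL at all. FakeWindow, createFakeWindow() and destroyFakeWindow() are hypothetical stand-ins for SDL_Window, SDL_CreateWindow() and SDL_DestroyWindow(); only the std::unique_ptr with a custom deleter mirrors what AppSystem does.

```cpp
#include <memory>

// Hypothetical stand-in for an opaque C resource such as SDL_Window.
struct FakeWindow { int id; };

static int liveWindows = 0; // how many fake windows currently exist

// Hypothetical stand-ins for SDL_CreateWindow() / SDL_DestroyWindow().
FakeWindow* createFakeWindow(int id) { ++liveWindows; return new FakeWindow{id}; }
void destroyFakeWindow(FakeWindow* win) { --liveWindows; delete win; }

// Same shape as AppSystem::uWindow: a unique_ptr that knows how to free its resource.
typedef std::unique_ptr<FakeWindow, decltype(&destroyFakeWindow)> uFakeWindow;

// Creates a window in a local scope and reports how many are alive inside that scope.
int windowsAliveInsideScope() {
	uFakeWindow win(createFakeWindow(1), destroyFakeWindow);
	return liveWindows;
} // win goes out of scope here, so destroyFakeWindow() runs automatically
```

Calling windowsAliveInsideScope() reports one live window while the function is running, and liveWindows is back to zero once it returns, without a single explicit destroy call in sight.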

Effectively, we have a run() function that contains our loop, which keeps running processEvent(), updateLogic() and render() in that particular order until some event causes the program to quit.
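
If you want to convince yourself of that ordering without opening a window, here is a headless sketch of the same loop. FakeEvent, HeadlessApp and runHeadlessDemo() are made up for this illustration; a plain std::queue plays the role of SDL's event queue, and a string logs the calls.

```cpp
#include <queue>
#include <string>

// Hypothetical stand-in for SDL_Event: just an event code.
enum FakeEvent { FAKE_KEYPRESS, FAKE_QUIT };

struct HeadlessApp {
	std::queue<FakeEvent> pending; // plays the role of SDL's event queue
	std::string log;               // records the order in which things happen

	void processEvent(FakeEvent ev, bool& causesExit) {
		log += 'E';
		if(ev == FAKE_QUIT) causesExit = true;
	}
	void updateLogic() { log += 'U'; }
	void render() { log += 'R'; }

	void run() {
		bool isInterrupted = false;
		do {
			// Drain every pending event first, like while(SDL_PollEvent(&ev)).
			while(!pending.empty()) {
				const FakeEvent ev = pending.front();
				pending.pop();
				processEvent(ev, isInterrupted);
			}
			updateLogic();
			render();
		} while(!isInterrupted);
	}
};

// Feeds one keypress and one quit event through the loop and returns the call log.
std::string runHeadlessDemo() {
	HeadlessApp app;
	app.pending.push(FAKE_KEYPRESS);
	app.pending.push(FAKE_QUIT);
	app.run();
	return app.log;
}
```

The log comes out as "EEUR": both events are processed first, then one update and one render happen, and only then does the loop notice that it should stop. In other words, the frame in which the quit event arrives still gets updated and rendered once.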

So, when are we going to render triangles?

We'll get there in time. Be patient, young child.

As you can see, this code doesn't do much so far. It creates a window, but doesn't even fill it with anything. So, before we proceed, I'm going to refactor just three things.

AppSystem is now an abstract class, with three pure virtual functions. All three of them get called within "run".
class AppSystem
{
public:
	typedef std::unique_ptr<SDL_Window,decltype(&SDL_DestroyWindow)> uWindow;
private:
	AppSystem(const AppSystem& cpy) = delete; // Disable copy constructor
	AppSystem& operator=(const AppSystem& cpy) = delete; // Disable copy assignment operator
protected:
	uWindow window;
	virtual void processEvent(const SDL_Event& ev, bool& causesExit) = 0;
	virtual void updateLogic() = 0;
	virtual void render() = 0;
public:
	// Regular constructors
	virtual ~AppSystem() = default;
	AppSystem(const char *title, int offsetX, int offsetY, int width, int height, Uint32 flags);
	AppSystem(const std::string& title, int offsetX, int offsetY, int width, int height, Uint32 flags);
	void run();
};

I'm fully aware that inheritance and the usage of virtual functions are highly frowned upon these days in the programming community, but this will aid code readability.

So, now we're going to create a new class that subclasses AppSystem. I'll call it SoftwareRendererSystem.

SoftwareRendererSystem.hpp
#ifndef SOFTWARERENDERERSYSTEM_HPP
#define SOFTWARERENDERERSYSTEM_HPP
#include "AppSystem.hpp"

class SoftwareRendererSystem : public AppSystem
{
public:
	typedef std::unique_ptr<SDL_Texture,decltype(&SDL_DestroyTexture)> uSdlTexture;
	typedef std::unique_ptr<SDL_Renderer,decltype(&SDL_DestroyRenderer)> uSdlRenderer;
protected:
	uSdlRenderer renderer;
	uSdlTexture framebuffer;
	void processEvent(const SDL_Event& ev, bool& causesExit) override;
	void updateLogic() override;
	void render() override;
public:
	SoftwareRendererSystem(int width, int height);
};

#endif // SOFTWARERENDERERSYSTEM_HPP

SoftwareRendererSystem.cpp
#include "SoftwareRendererSystem.hpp"

SoftwareRendererSystem::SoftwareRendererSystem(int width, int height)
	: AppSystem("Software Renderer Demo",0,0,width,height,0),
	renderer(SDL_CreateRenderer(this->window.get(),0,0),SDL_DestroyRenderer),
	framebuffer(SDL_CreateTexture(this->renderer.get(), SDL_PIXELFORMAT_RGBA8888, SDL_TEXTUREACCESS_STREAMING, width,height),SDL_DestroyTexture)
{

}

void SoftwareRendererSystem::processEvent(const SDL_Event &ev, bool &causesExit)
{
	switch(ev.type) {
	case SDL_QUIT: causesExit = true; break;
	default: break;
	}
}

void SoftwareRendererSystem::updateLogic()
{

}

void SoftwareRendererSystem::render()
{
	SDL_RenderCopy(renderer.get(), framebuffer.get(), nullptr, nullptr);
	SDL_RenderPresent(renderer.get());
}

main.cpp
#include <iostream>
#include "SoftwareRendererSystem.hpp"


using namespace std;
static const int WIDTH = 640;
static const int HEIGHT = 480;

int main()
{
	SDL_Init(SDL_INIT_VIDEO);
	SoftwareRendererSystem app(WIDTH,HEIGHT);
	app.run();
	return 0;
}

And with this, we are finally producing something...

But I think I owe you all an explanation on what just happened, or rather, what all this new code is. What does SDL_TEXTUREACCESS_STREAMING even mean?!

Well, first of all, the SDL_Renderer takes care of presenting our texture on the screen. The SDL_Texture is, as its name says, a texture; since it's the one we'll be putting onto the screen, it effectively acts as our framebuffer. We use SDL_RenderCopy() to copy the texture to the renderer's current target, then SDL_RenderPresent() to put the result onto the screen. Just like with the window, we are applying RAII to the renderer and the texture. SDL_TEXTUREACCESS_STREAMING simply means that we'll be modifying the texture very often - like, uh, every single frame?

However, this black window is obviously rather boring, so we'll need to find a way to put colour onto the screen. We'll need another texture - one we can access directly - to store the pixel data we are currently modifying, and then we can run SDL_UpdateTexture() to update our framebuffer from it, effectively creating a form of double-buffering.
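
To give a rough idea of what such a buffer-to-buffer update involves, here is a sketch of a row-by-row copy between two pixel buffers whose rows may have different lengths in bytes (the "pitch" or "stride"). copyRows() and copyDemo() are hypothetical helpers written for this article; this is not SDL's actual implementation, just the general shape of the operation.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `rows` rows of `rowBytes` bytes each. Source and destination may have
// different pitches (bytes from the start of one row to the start of the next).
inline void copyRows(uint8_t* dst, std::size_t dstPitch,
					 const uint8_t* src, std::size_t srcPitch,
					 std::size_t rowBytes, std::size_t rows) {
	for(std::size_t y = 0; y < rows; ++y) {
		std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
	}
}

// Demo: a tightly packed 2x2 source copied into a destination that is
// padded to 3 pixels per row.
inline std::vector<uint32_t> copyDemo() {
	const std::vector<uint32_t> src = { 1, 2, 3, 4 }; // 2 pixels per row
	std::vector<uint32_t> dst(6, 0);                  // 3 pixels per row, the last one is padding
	copyRows(reinterpret_cast<uint8_t*>(dst.data()), 3 * sizeof(uint32_t),
			 reinterpret_cast<const uint8_t*>(src.data()), 2 * sizeof(uint32_t),
			 2 * sizeof(uint32_t), 2);
	return dst;
}
```

copyDemo() returns { 1, 2, 0, 3, 4, 0 }: each source row lands at the start of a destination row and the padding pixel stays untouched. This is why a pitch has to be passed along with the raw pixel pointer, because the receiving side cannot guess where each row begins.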

But first we need to do more up-front work!

At this point, you might be pulling out your own hair at the amount of work we have to perform just to put some stinking pixels onto the screen, but all the up-front investment will pay off in the end.

Texture.hpp
#ifndef TEXTURE_HPP
#define TEXTURE_HPP
#include <glm/glm.hpp>

class Texture
{
public:
	virtual ~Texture() = default;
	// Data getters
	virtual int getWidth() const = 0;
	virtual float getWidthF() const = 0;
	virtual int getHeight() const = 0;
	virtual float getHeightF() const = 0;
	virtual int getStride() const = 0;
	// Pixel manipulation
	virtual void getPixel(const glm::ivec2& pos, glm::vec4& colourKernel) const = 0;
	inline glm::vec4 getPixel(const glm::ivec2& pos) const {
		glm::vec4 tmp;
		getPixel(pos,tmp);
		return tmp;
	}
	virtual void setPixel(const glm::ivec2& pos, const glm::vec4& colourKernel) = 0;
	virtual void* getRawPixels() = 0;
	virtual const void* getRawPixels() const = 0;
};

#endif // TEXTURE_HPP

Now, once again, I'm fully aware that I am making a million anti-OOP programmers scream in agony, as I am declaring an abstract class with pure virtual functions with the full intent of having other classes inherit it, but trust me, it'll improve code readability and re-usability. If performance were the main concern, I'd probably be relying on templates.

In fact, speaking of templates...

StandardTexture.hpp
#ifndef STANDARDTEXTURE_HPP
#define STANDARDTEXTURE_HPP
#include "Texture.hpp"
#include <vector>

template <typename PixelType> class StandardTexture : public Texture
{
private:
	std::vector<PixelType> buff;
	int w,h,stride;
	float fw, fh;
public:
	StandardTexture(int w, int h) : buff(w*h), w(w), h(h), stride(w*sizeof(PixelType)), fw(w), fh(h) {

	}
	int getWidth() const { return w; }
	float getWidthF() const { return fw; }
	int getHeight() const { return h; }
	float getHeightF() const { return fh; }
	int getStride() const { return stride; }
	// Pixel manipulation
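	using Texture::getPixel; // keep the one-argument overload from Texture visible
	// Wrap a coordinate into [0, size), also for negative input.
	static int wrap(int coord, int size) { return ((coord % size) + size) % size; }
	void getPixel(const glm::ivec2& pos, glm::vec4& colourKernel) const override {
		const PixelType& pix = buff[(wrap(pos.y, h) * w) + wrap(pos.x, w)];
		pix.fillKernel(colourKernel);
	}
	void setPixel(const glm::ivec2& pos, const glm::vec4& colourKernel) override {
		PixelType& pix = buff[(wrap(pos.y, h) * w) + wrap(pos.x, w)];
		pix.fromKernel(colourKernel);
	}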
	void* getRawPixels() { return buff.data(); }
	const void* getRawPixels() const  { return buff.data(); }
};

#endif // STANDARDTEXTURE_HPP

With this trusty little template, we can easily wrap around any pixel format, as long as we feed it a struct that contains the fillKernel and fromKernel functions, both taking a reference to a glm::vec4. So, let's implement one.
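
To show how little such a struct needs, here is a sketch of a second, hypothetical pixel format: 8-bit greyscale. PixelGrey8 is not part of this article's code base, and Vec4 is a stand-in for glm::vec4 so that the snippet compiles without GLM; the luma weights are the usual Rec. 601 ones.

```cpp
#include <algorithm>
#include <cstdint>

// Stand-in for glm::vec4, only here so the sketch compiles without GLM.
struct Vec4 { float r, g, b, a; };

// Hypothetical 8-bit greyscale pixel that follows the fillKernel/fromKernel contract.
struct PixelGrey8 {
	uint8_t luma;
	// Expand the stored grey level into a normalized, fully opaque RGBA colour.
	inline void fillKernel(Vec4& colourKernel) const {
		const float l = float(luma) / 255.0f;
		colourKernel.r = l;
		colourKernel.g = l;
		colourKernel.b = l;
		colourKernel.a = 1.0f;
	}
	// Collapse an RGBA colour to a single grey level (alpha is simply dropped).
	inline void fromKernel(const Vec4& colourKernel) {
		const float l = 0.299f * colourKernel.r + 0.587f * colourKernel.g + 0.114f * colourKernel.b;
		const float clamped = std::min(1.0f, std::max(0.0f, l));
		luma = uint8_t(clamped * 255.0f + 0.5f); // round to the nearest level
	}
};
```

With glm::vec4 swapped back in for Vec4, something like StandardTexture<PixelGrey8> should then behave just like the RGBA variant, only with one byte per pixel.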

StandardPixelType.hpp
#ifndef STANDARDPIXELTYPE_HPP
#define STANDARDPIXELTYPE_HPP
#include "StandardTexture.hpp"
#include <cstdint>

struct PixelRgba8 {
	uint32_t rgba;
	static constexpr const float reciprocal = 1.0f / 255.0f;
	inline void fillKernel(glm::vec4& colourKernel) const {
		colourKernel.r = float(rgba >> 24 & 0xFF) * reciprocal;
		colourKernel.g = float(rgba >> 16 & 0xFF) * reciprocal;
		colourKernel.b = float(rgba >> 8 & 0xFF) * reciprocal;
		colourKernel.a = float(rgba  & 0xFF) * reciprocal;
	}
	inline void fromKernel(const glm::vec4& colourKernel) {
		// Clamp to [0,1] first so out-of-range values cannot spill into neighbouring channels.
		const glm::vec4 c = glm::clamp(colourKernel, 0.0f, 1.0f);
		rgba = (uint32_t(c.r*255.0f) << 24) |
				(uint32_t(c.g*255.0f) << 16) |
				(uint32_t(c.b*255.0f) << 8) | uint32_t(c.a*255.0f);
	}
};
typedef StandardTexture<PixelRgba8> TextureRgba8;

#endif // STANDARDPIXELTYPE_HPP

Okay, so now we can have textures in an RGBA format with 8 bits per channel that we can directly modify. I'm not going to get into details on how to manipulate pixels bitwise, there are already a million articles on that - if it's highly requested of me, I might get into dithering though.
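
For those who just want the gist of the bit-fiddling, here is a small worked sketch using the same layout as PixelRgba8, with red in the topmost byte and alpha in the lowest. The helper names (packRgba8(), redOf() and friends) are made up for this illustration.

```cpp
#include <cstdint>

// Pack four 8-bit channels into one 32-bit word, red in the top byte.
inline uint32_t packRgba8(uint8_t r, uint8_t g, uint8_t b, uint8_t a) {
	return (uint32_t(r) << 24) | (uint32_t(g) << 16) | (uint32_t(b) << 8) | uint32_t(a);
}

// Extract a channel: shift it down to the lowest byte, then mask off the rest.
inline uint8_t redOf(uint32_t rgba)   { return uint8_t((rgba >> 24) & 0xFF); }
inline uint8_t greenOf(uint32_t rgba) { return uint8_t((rgba >> 16) & 0xFF); }
inline uint8_t blueOf(uint32_t rgba)  { return uint8_t((rgba >> 8) & 0xFF); }
inline uint8_t alphaOf(uint32_t rgba) { return uint8_t(rgba & 0xFF); }
```

For example, packRgba8(255, 128, 0, 255), an opaque orange, gives 0xFF8000FF, and greenOf(0xFF8000FF) gives 128 back.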

So anyway... What now?
Well, we put it into our renderer!

SoftwareRendererSystem.hpp
#ifndef SOFTWARERENDERERSYSTEM_HPP
#define SOFTWARERENDERERSYSTEM_HPP
#include "AppSystem.hpp"
#include "StandardPixelType.hpp"

class SoftwareRendererSystem : public AppSystem
{
public:
	typedef std::unique_ptr<SDL_Texture,decltype(&SDL_DestroyTexture)> uSdlTexture;
	typedef std::unique_ptr<SDL_Renderer,decltype(&SDL_DestroyRenderer)> uSdlRenderer;
protected:
	uSdlRenderer renderer;
	uSdlTexture framebuffer;
	TextureRgba8 renderBuffer;
	void processEvent(const SDL_Event& ev, bool& causesExit) override;
	void updateLogic() override;
	void render() override;
public:
	SoftwareRendererSystem(int width, int height);
};

#endif // SOFTWARERENDERERSYSTEM_HPP

SoftwareRendererSystem.cpp
#include "SoftwareRendererSystem.hpp"


SoftwareRendererSystem::SoftwareRendererSystem(int width, int height)
	: AppSystem("Software Renderer Demo",0,0,width,height,0),
	renderer(SDL_CreateRenderer(this->window.get(),0,0),SDL_DestroyRenderer),
	framebuffer(SDL_CreateTexture(this->renderer.get(), SDL_PIXELFORMAT_RGBA8888, SDL_TEXTUREACCESS_STREAMING, width,height),SDL_DestroyTexture),
	renderBuffer(width,height)
{

}

void SoftwareRendererSystem::processEvent(const SDL_Event &ev, bool &causesExit)
{
	switch(ev.type) {
	case SDL_QUIT: causesExit = true; break;
	default: break;
	}
}

void SoftwareRendererSystem::updateLogic()
{

}

void SoftwareRendererSystem::render()
{
	for(int x = 0; x < renderBuffer.getWidth(); ++x) {
		for(int y = 0; y < renderBuffer.getHeight(); ++y) {
			const float r = float(x) / renderBuffer.getWidthF();
			const float g = float(y) / renderBuffer.getHeightF();
			renderBuffer.setPixel(glm::ivec2(x,y),glm::vec4(r,g,0.0f,1.0f));
		}
	}
	SDL_UpdateTexture(framebuffer.get(), nullptr, renderBuffer.getRawPixels(), renderBuffer.getStride() );
	SDL_RenderCopy(renderer.get(), framebuffer.get(), nullptr, nullptr);
	SDL_RenderPresent(renderer.get());
}

Okay, so, explanation time. What does this code do now?

The first thing that warrants explanation is the SDL_UpdateTexture() part. It copies the raw bytes of our render buffer into the framebuffer texture; the last argument is the pitch, i.e. the number of bytes in one row of pixel data, which is what getStride() returns. So basically, we have a double-buffering of sorts going on.

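The nested loop in render() is a quick demonstration, where we basically draw a gradient: red increases from left to right and green increases from top to bottom, so the top-left corner is black, the top-right is red, the bottom-left is green, and the bottom-right comes out close to yellow.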

But when are we going to render triangles?!

Be patient, my child. We just took the first great step towards implementing software-rendering, manipulating pixels that are going to appear on the screen.

Still, if you are THAT impatient, you get a sneak peek of the next episode:

#ifndef RENDERINGPIPELINE_HPP
#define RENDERINGPIPELINE_HPP
#include <cstddef>
#include <functional>
#include <glm/glm.hpp>
#include "Texture.hpp"

template<typename VertexInType, typename VertexOutType, typename UniformType> struct RenderingPipeline {
	typedef std::function<VertexOutType(const UniformType&, const VertexInType&, const glm::ivec4&)> VertexShader;
	typedef std::function<void(Texture&, const VertexOutType&, const VertexOutType&, const VertexOutType&, float, float, float)> FragmentShader;
	void rasterize(const UniformType& uniform,
				   const VertexOutType& v0, const VertexOutType& v1, const VertexOutType& v2,
				   Texture& framebuffer, const FragmentShader& frag) {
		// TO BE IMPLEMENTED
	}
	void renderTriangle(const UniformType& uniform, const VertexShader& vert, const FragmentShader& frag,
						const glm::ivec4& viewport, Texture& framebuffer,
						const VertexInType& i0, const VertexInType& i1, const VertexInType& i2) {
		const VertexOutType o0 = vert(uniform,i0,viewport);
		const VertexOutType o1 = vert(uniform,i1,viewport);
		const VertexOutType o2 = vert(uniform,i2,viewport);
		rasterize(uniform,o0,o1,o2,framebuffer,frag);
	}
	void renderTriangles(const UniformType& uniform, const VertexShader& vert, const FragmentShader& frag,
						const glm::ivec4& viewport, Texture& framebuffer,
						const VertexInType* vertices, size_t vertexCount) {
		for(size_t i = 0; i + 2 < vertexCount; i += 3) { // skip an incomplete trailing triangle
			renderTriangle(uniform,vert,frag,viewport,framebuffer,
						   vertices[i],vertices[i+1],vertices[i+2]);
		}
	}
	void renderTriangles(const UniformType& uniform, const VertexShader& vert, const FragmentShader& frag,
						const glm::ivec4& viewport, Texture& framebuffer,
						const VertexInType* vertices, const unsigned* indices, size_t indexCount) {
		for(size_t i = 0; i + 2 < indexCount; i += 3) { // skip an incomplete trailing triangle
			renderTriangle(uniform,vert,frag,viewport,framebuffer,
						   vertices[indices[i]],vertices[indices[i+1]],vertices[indices[i+2]]);
		}
	}
};

#endif // RENDERINGPIPELINE_HPP

All code is uploaded to GitHub... except the sneak peek.
